Formal synthesis is the process of generating a program satisfying a high-level formal specification.
In recent times, effective formal synthesis methods have been proposed based on the use of inductive 
learning. We refer to this class of methods that learn programs from examples as formal inductive 
synthesis. In this paper, we present a theoretical framework for formal inductive synthesis. We 
discuss how formal inductive synthesis differs from traditional machine learning. We then describe 
oracle-guided inductive synthesis (OGIS), a framework that captures a family of synthesizers that 
operate by iteratively querying an oracle. An instance of OGIS that has had much practical impact 
is counterexample-guided inductive synthesis (CEGIS). We present a theoretical characterization 
of CEGIS for learning any program that computes a recursive language. In particular, we analyze 
the relative power of CEGIS variants where the types of counterexamples generated by the oracle
vary. We also consider the impact of bounded versus unbounded memory available to the learning
algorithm. In the special case where the universe of candidate programs is finite, we relate the speed
of convergence to the notion of teaching dimension studied in machine learning theory. Altogether,
the results of the paper take a first step towards a theoretical foundation for the emerging field of
formal inductive synthesis. 


1 Introduction 

The field of formal methods has made enormous strides in recent decades. Formal verification techniques 
such as model checking [15, 47] and theorem proving (see, e.g., [45, 36, 22]) are used routinely in
the computer-aided design of integrated circuits and have been widely applied to find bugs in software, 
analyze models of embedded systems, and find security vulnerabilities in programs and protocols. At 
the heart of many of these advances are computational reasoning engines such as Boolean satisfiability 
(SAT) solvers, Binary Decision Diagrams (BDDs), and satisfiability modulo theories (SMT)
solvers [8]. Alongside these advances, there has been a growing interest in the synthesis of programs or
systems from formal specifications with correctness guarantees. We refer to this area as formal synthesis. 
Starting with the seminal work of Manna and Waldinger on deductive program synthesis [42] and Pnueli
and Rosner on reactive synthesis from temporal logic [46], there have been several advances that have
made formal synthesis practical in specific application domains such as robotics, online education, and 
end-user programming. 

Algorithmic approaches to formal synthesis range over a wide spectrum, from deductive synthesis 
to inductive synthesis. In deductive synthesis (e.g., [42]), a program is synthesized by constructively
proving a theorem, employing logical inference and constraint solving. On the other hand, inductive
synthesis [57, 52] seeks to find a program matching a set of input-output examples. At a high
level, it is thus an instance of learning from examples, also termed inductive inference or machine
learning. Many current approaches to synthesis blend induction and deduction in the sense that
even as they generalize from examples, deductive procedures are used in the process of generalization 
(see [51] for a detailed exposition). Even so, the term “inductive synthesis” is typically used to
refer to all of them. We will refer to these methods as formal inductive synthesis to place an emphasis 
on correctness of the synthesized artifact. These synthesizers generalize from examples by searching a 
restricted space of programs. In machine learning, this restricted space is called the concept class, and
each element of that space is often called a candidate concept. The concept class is usually specified
syntactically. It has been recognized that this syntax guidance, also termed a structure hypothesis, can
be crucial in helping the synthesizer converge quickly to the target concept [55, 51, 1].

The fields of formal inductive synthesis and machine learning have the same high-level goal: to
develop algorithmic techniques for synthesizing a concept (function, program, or classifier) from
observations (examples, queries, etc.). However, there are also important differences in the problem formulations
and techniques used in the two fields. We identify some of the main differences below:

1. Concept Classes: In traditional machine learning, the classes of concepts to be synthesized tend to
be specialized, such as linear functions or half-spaces, convex polytopes [25], neural networks of
specific forms [9], Boolean formulas in fixed, bounded syntactic forms [26], and decision trees [48].
However, in formal synthesis, the target concepts are general programs or automata with constraints 
or finite bounds imposed mainly to ensure tractability of synthesis. 

2. Learning Algorithms: In traditional machine learning, just as concept classes tend to be specialized,
so too are the learning algorithms for those classes. In contrast, in formal inductive synthesis,
the trend is towards using general-purpose decision procedures such as SAT solvers, SMT solvers, and 
model checkers that are not specifically designed for inductive learning. 

3. Exact vs. Approximate Learning: In formal inductive synthesis, there is a strong emphasis on exactly 
learning the target concept; i.e., the learner seeks to find a concept that is consistent with all positive 
examples but not with any negative example. The labels for examples are typically assumed to be 
correct. Moreover, the learned concept should satisfy a formal specification. In contrast, the emphasis 
in traditional machine learning is on techniques that perform approximate learning, where input data 
can be noisy, some amount of misclassification can be tolerated, there is no formal specification, and 
the overall goal is to optimize a cost function (e.g., capturing classification error). 

4. Emphasis on Oracle-Guidance: In formal inductive synthesis, there is a strong emphasis on learning in
the presence of an oracle, which is typically implemented using a general-purpose decision procedure 
or sometimes even a human user. Moreover, and importantly, the design of this oracle is part of the 
design of the synthesizer. In contrast, in traditional machine learning, the use of oracles is rare, and 
instead the learner typically selects examples from a corpus, often drawing examples independently 
from an underlying probability distribution. Even when oracles are used, they are assumed to be black 
boxes that the learner has no control over. The oracle is part of the problem definition in machine 
learning, whereas in formal inductive synthesis, the design of the oracle is part of the solution. 

The last item, oracle-guidance, is a particularly important difference, and informs the framework we
propose in this paper.

In this paper, we take first steps towards a theoretical framework and analysis of formal inductive 
synthesis. Most instances of inductive synthesis in the literature rely on an oracle that answers different 
types of queries. In order to capture these various synthesis methods in a unifying framework, we
formalize the notion of oracle-guided inductive synthesis (OGIS). While we defer a detailed treatment of
OGIS to Section 2, we point out three dimensions in which OGIS techniques differ from each other:

1. Characteristics of concept class: The concept class for synthesis may have different characteristics 
depending on the application domain. For instance, the class of programs from which the synthesizer 
must generate the correct one may be finite, as in the synthesis of bitvector programs [55, 24], or
infinite, as in the synthesis of guards for hybrid automata. In the former case, termination is
easily guaranteed, but it is not obvious for the case of infinite-size concept classes. 

2. Query types: Different applications may impose differing constraints on the capabilities of the
oracle. In some cases, the oracle may provide only positive examples. When verification engines are
used as oracles, as is typical in formal synthesis, the oracle may provide both positive examples and
counterexamples which refute candidate programs. More fine-grained properties of queries are also 
possible — for instance, an oracle may permit queries that request not just any counterexample, but 
one that is “minimum” according to some cost function. 

3. Resources available to the learning engine: As noted above, the learning algorithms in formal
inductive synthesis tend to be general-purpose decision procedures. Even so, for tractability, certain
constraints may be placed on the resources available to the decision procedure, such as time or
memory available. For example, one may limit the decision procedure to use a finite amount of memory,
such as imposing an upper bound on the number of (learned) clauses for a SAT solver. 

We conduct a theoretical study of OGIS by examining the impact of variations along the above three
dimensions. Our work has a particular focus on counterexample-guided inductive synthesis (CEGIS) [55],
a particularly popular and effective instantiation of the OGIS framework. When the concept class is of
infinite size, termination of CEGIS is not guaranteed. We study the relative strength of different versions
of CEGIS with regard to their termination guarantees. The versions vary based on the type of
counterexamples one can obtain from the oracle. We also analyze the impact of finite versus infinite memory
available to the learning algorithm to store examples and hypothesized programs/concepts. Finally, when
the concept class is of finite size, even though termination of CEGIS is guaranteed, the speed of termination
can still be an issue. In this case, we draw a connection between the number of counterexamples needed
by a CEGIS procedure and the notion of teaching dimension [20] previously introduced in the machine
learning literature. 

To summarize, we make the following specific contributions in this paper:

1. We define the formal inductive synthesis problem and propose a class of solution techniques termed
Oracle-Guided Inductive Synthesis (OGIS). We illustrate how OGIS generalizes instances of concept
learning in machine learning/artificial intelligence as well as synthesis techniques developed using
formal methods. We provide examples of synthesis techniques from the literature and show how they can
be represented as instantiations of OGIS.

2. We perform a theoretical comparison of different instantiations of the OGIS paradigm in terms of
their synthesis power. The synthesis power of an OGIS technique is defined as the class of
concepts/programs (from an infinite concept class) that can be synthesized using that technique. We
establish the following specific novel theoretical results:

• For learning engines that can use unbounded memory, the power of synthesis engines using an oracle
that provides arbitrary counterexamples or minimal counterexamples is the same. But this is strictly
more powerful than using an oracle that provides counterexamples bounded by the size of
the positive examples.

• For learning engines that use bounded memory, the power of synthesis engines using arbitrary
counterexamples or minimal counterexamples is still the same. The power of synthesis engines using
counterexamples bounded by positive examples is not comparable to those using arbitrary/minimal
counterexamples. Contrary to intuition, using counterexamples bounded by positive examples
allows one to synthesize programs from program classes which cannot be synthesized using arbitrary
or minimal counterexamples.

3. For finite concept classes, we prove the NP-hardness of the problem of solving the formal inductive
synthesis problem for finite domains for a large class of OGIS techniques. We also show that the
teaching dimension [20] of the concept class is a lower bound on the number of counterexamples
needed for a CEGIS technique to terminate (on an arbitrary program from that class).
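
To make the notion of teaching dimension concrete, the following is a minimal brute-force sketch in Python. It assumes the usual definition from the machine learning literature: a teaching set for a concept is a set of labeled examples on which every other concept in the class disagrees somewhere, and the teaching dimension is the largest minimum teaching set size over the class. The function names and the toy concept class (three singletons plus the empty set) are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations

def teaching_set_size(c, concepts, domain):
    """Size of a smallest set of examples that singles out c within the class."""
    others = [d for d in concepts if d != c]
    for k in range(len(domain) + 1):
        for xs in combinations(domain, k):
            # xs teaches c if every other concept disagrees with c on some x in xs
            if all(any((x in c) != (x in d) for x in xs) for d in others):
                return k
    return None  # only reachable if the class contains duplicate concepts

def teaching_dimension(concepts, domain):
    """Worst case, over all concepts, of the smallest teaching set size."""
    return max(teaching_set_size(c, concepts, domain) for c in concepts)

# Hypothetical toy class: the empty concept and the three singletons over {0, 1, 2}.
domain = [0, 1, 2]
concepts = [frozenset(), frozenset({0}), frozenset({1}), frozenset({2})]
td = teaching_dimension(concepts, domain)
```

In this toy class a singleton is taught by one positive example, whereas the empty concept needs all three negative examples, so the class as a whole is as hard to teach as its hardest member.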

The rest of the paper is organized as follows. We first present the Oracle-Guided Inductive Synthesis
(OGIS) paradigm in Section 2. We then discuss related work, present the notation and
definitions used for theoretical analysis, and give the theoretical results and their proofs
in the sections that follow. We conclude by summarizing our results and discussing open problems. A
preliminary version of this paper appeared in the SYNT 2014 workshop [33].


2 Oracle-Guided Inductive Synthesis: OGIS 

We begin by defining some basic terms and notation. Following standard terminology in the machine
learning theory community [4], we define a concept $c$ as a set of examples drawn from a domain of
examples $E$. In other words, $c \subseteq E$. An example $x \in E$ can be viewed as an input-output behavior of a
program; for example, a (pre, post) state for a terminating program, or an input-output trace for a reactive
program. Thus, in this paper, we ignore syntactic issues in representing concepts and model them in terms
of their semantics, as a set of behaviors. The set of all possible concepts is termed the concept class,
denoted by $\mathcal{C}$. Thus, $\mathcal{C} \subseteq 2^E$. The concept class may either be specified in the original synthesis problem
or arise as a result of a structure hypothesis that restricts the space of candidate concepts. Depending on
the application domain, $E$ can be finite or infinite. The concept class $\mathcal{C}$ can also be finite or infinite. Note
that it is possible to place (syntactic) restrictions on concepts so that $\mathcal{C}$ is finite even when $E$ is infinite.

One key distinguishing characteristic between traditional machine learning and formal inductive
synthesis is the presence of an explicit formal specification in the latter. We define a specification $\Phi$ as a set
of “correct” concepts, i.e., $\Phi \subseteq \mathcal{C} \subseteq 2^E$. Any example $x \in E$ such that there is a concept $c \in \Phi$ where
$x \in c$ is called a positive example. Likewise, an example $x$ that is not contained in any $c \in \Phi$ is a negative
example. We will write $x \vdash \Phi$ to denote that $x$ is a positive example. An example that is specified to be
either positive or negative is termed a labeled example.
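
As a small illustration of these definitions, the sketch below models concepts as Python `frozenset`s and a specification as a set of concepts, and labels each example as positive or negative. The integer domain, the particular specification, and the helper names `is_positive` and `label` are assumptions made purely for illustration.

```python
from typing import FrozenSet, Set

Example = int                   # assumption: examples are small integers
Concept = FrozenSet[Example]    # a concept is a set of examples, c ⊆ E

def is_positive(x: Example, spec: Set[Concept]) -> bool:
    """x is positive iff at least one correct concept contains it."""
    return any(x in c for c in spec)

def label(x: Example, spec: Set[Concept]) -> str:
    return "positive" if is_positive(x, spec) else "negative"

E = frozenset(range(6))
# A partial specification admitting two correct concepts.
spec = {frozenset({0, 1}), frozenset({1, 2})}
labels = {x: label(x, spec) for x in sorted(E)}
```

Note that example 0 is positive even though the correct concept {1, 2} does not contain it; positivity only requires membership in some correct concept.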

Note that standard practice in formal methods is to define a specification as a set of examples, i.e.,
$\Phi \subseteq E$. This is consistent with most properties that are trace properties, where $\Phi$ represents the set
of allowed behaviors (traces, (pre, post) states, etc.) of the program. However, certain practical
properties of systems, e.g., certain security policies, are not trace properties, and therefore
we use the more general definition of a specification.

We now define what it means for a concept to satisfy $\Phi$. Given a concept $c \in \mathcal{C}$, we say that $c$ satisfies
$\Phi$ iff $c \in \Phi$. If we have a complete specification, it means that $\Phi$ is a singleton set comprising only a
single allowed concept. In general, $\Phi$ is likely to be a partial specification that allows for multiple correct
concepts.

We now present a first definition of the formal inductive synthesis problem: 

Given a concept class $\mathcal{C}$ and a domain of examples $E$, the formal inductive synthesis
problem is to find, using only a subset of examples from $E$, a target concept $c \in \mathcal{C}$ that satisfies
a specification $\Phi \subseteq \mathcal{C}$.

This definition is reasonable in cases where only elements of $E$ can be accessed by the synthesis engine,
which is the common case in the use of machine learning methods. However, existing formal verification and
synthesis methods can use a somewhat richer set of inputs, including Boolean answers to equivalence
(verification) queries with respect to the specification $\Phi$, as well as verification queries with respect to
other constructed specifications. Moreover, the synthesis engine typically does not directly access or
manipulate the specification $\Phi$. In order to formalize this richer source of inputs as well as the indirect
access to $\Phi$, we introduce the concept of an oracle interface.

Definition 2.1 An oracle interface $\mathcal{O}$ is a subset of $\mathcal{Q} \times \mathcal{R}$, where $\mathcal{Q}$ is a set of query types, $\mathcal{R}$ is a
corresponding set of response types, and $\mathcal{O}$ defines which pairs of query and response types are semantically
well-formed. ■
A simple instance of an oracle interface is one with a single query type that returns positive examples
from $E$. In this case, the synthesis problem is to learn a correct program from purely positive examples.
The more common case in machine learning (of classifiers) is to have an oracle that supports two kinds
of queries, one that returns positive examples and another that returns negative examples. As we will see
in Sec. 2.1, there are richer types of queries that are commonly used in formal synthesis. For now, we
will leave $\mathcal{Q}$ and $\mathcal{R}$ as abstract sets.

Implementations of the oracle interface can be nondeterministic algorithms which exhibit
nondeterministic choice in the stream of queries and responses. We define the notion of nondeterministic mapping
to represent such algorithms.

Definition 2.2 A nondeterministic mapping $F : I \to O$ takes as input $i \in I$ and produces an output
$o \in O(i) \subseteq O$, where $O(i)$ is the set of all valid outputs corresponding to input $i$ in $F$.

With this notion of an oracle interface, we now introduce our definition of formal inductive synthesis 
(FIS): 

Definition 2.3 Consider a concept class $\mathcal{C}$, a domain of examples $E$, a specification $\Phi$, and an oracle
interface $\mathcal{O}$. The formal inductive synthesis problem is to find a target concept $c \in \mathcal{C}$ that satisfies $\Phi$,
given only $\mathcal{O}$ and $\mathcal{C}$. In other words, $E$ and $\Phi$ can be accessed only through $\mathcal{O}$. ■

Thus, an instance of FIS is defined in terms of the tuple $(\mathcal{C}, E, \Phi, \mathcal{O})$. We next introduce a family
of solution techniques for the FIS problem. A FIS problem instance defines an oracle interface, and
a solution technique for that problem instance can access the domain $E$ and the specification $\Phi$ only
through that interface.


2.1 OGIS: A family of synthesizers 

Oracle-guided inductive synthesis (OGIS) is an approach to solve the formal inductive synthesis problem 
defined above, encompassing a family of synthesis algorithms. 


[Figure 1: Oracle-Guided Inductive Synthesis. The inductive learning engine (learning bias: program template or concept class; learning algorithm: memory constraints, time constraints) sends queries, such as membership, subsumption, or witness queries, to the oracle (knowledge: complete or partial specification, human users, cost function to be optimized; algorithm: equivalence checking, constraint optimization, distinguishing input generation). The oracle returns responses such as positive examples, counterexamples, or Boolean Yes/No answers.]

As illustrated in Figure 1, OGIS comprises two key components: an inductive learning engine (also
sometimes referred to as a “Learner”) and an oracle (also referred to as a “Teacher”). The interaction
between the learner and the oracle is in the form of a dialogue comprising queries and responses. The
oracle is defined by the types of queries that it can answer, and the properties of its responses. Synthesis
is thus an iterative process: at each step, the learner formulates and sends a query to the oracle, and
the oracle sends its response. For formal synthesis, the oracle is also tasked with determining whether
the learner has found a correct target concept. Thus, the oracle implicitly or explicitly maintains the
specification $\Phi$ and can report to the learner when it has terminated with a correct concept.

We first formalize the notions of learner and oracle. Let $Q$ be a set of queries of types $\mathcal{Q}$, and $R$
be a set of responses of types $\mathcal{R}$. We allow both $Q$ and $R$ to include a special element $\bot$ indicating the
absence of a query or response. An element $(q, r) \in Q \times R$ is said to conform to an oracle interface $\mathcal{O}$ if
$q$ is of type $q_t$, $r$ is of type $r_t$, and $(q_t, r_t) \in \mathcal{O}$. A valid dialogue pair for an oracle interface $\mathcal{O}$, denoted $d$,
is a query-response pair $(q, r)$ such that $q \in Q$, $r \in R$, and $(q, r)$ conforms to $\mathcal{O}$. The set of valid dialogue
pairs for an oracle interface is denoted by $D$, and $D^*$ denotes the set of valid dialogue sequences, that is, finite
sequences of valid dialogue pairs. If $\delta \in D^*$ is a valid dialogue sequence, $\delta[i]$ denotes a sub-sequence of
$\delta$ of length $i$ and $\delta(i)$ denotes the $i$-th dialogue pair in the sequence.
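
The conformance condition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: query and response types are represented as plain strings, the `Query` and `Response` records and the type names `"mem"`, `"ce"`, `"label"`, `"example"` and `"bottom"` are hypothetical, and the oracle interface is a set of (query type, response type) pairs.

```python
from typing import List, NamedTuple, Set, Tuple

class Query(NamedTuple):
    qtype: str            # query type, e.g. "mem" or "ce" (hypothetical names)
    arg: object = None    # the example or concept the query is about

class Response(NamedTuple):
    rtype: str            # response type, e.g. "label", "example", "bottom"
    value: object = None

Interface = Set[Tuple[str, str]]   # well-formed (query type, response type) pairs

def conforms(q: Query, r: Response, interface: Interface) -> bool:
    """(q, r) conforms iff its pair of types belongs to the interface."""
    return (q.qtype, r.rtype) in interface

def is_valid_dialogue(seq: List[Tuple[Query, Response]], interface: Interface) -> bool:
    """A dialogue sequence is valid iff every pair in it conforms."""
    return all(conforms(q, r, interface) for q, r in seq)

interface = {("mem", "label"), ("ce", "example"), ("ce", "bottom")}
dialogue = [
    (Query("mem", 3), Response("label", True)),
    (Query("ce", frozenset({1})), Response("bottom")),
]
# Answering a membership query with an example is not well-formed here.
bad_dialogue = dialogue + [(Query("mem", 4), Response("example", 4))]
```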

Definition 2.4 An oracle is a nondeterministic mapping $O : D^* \times Q \to R$. $O$ is consistent with a given
interface $\mathcal{O}$ iff, given a valid dialogue sequence $\delta$ and a query $q$ of type $q_t$, $O(\delta, q)$ is a response of type
$r_t$ where $(q_t, r_t) \in \mathcal{O}$. A learner is a nondeterministic mapping $L : D^* \to Q \times \mathcal{C}$. $L$ is consistent with a
given interface $\mathcal{O}$ iff, given a valid dialogue sequence $\delta$, $L(\delta) = (q, c)$ where $q \in Q$ has type $q_t$ such that
there exists a response type $r_t$ s.t. $(q_t, r_t) \in \mathcal{O}$. ■

We will further assume in this paper that the oracle $O$ is sound, meaning that it gives a correct response to
every query it receives. For example, if asked for a positive example, $O$ will not return a negative example
instead. This notion is left informal for now, since a formalization requires discussion of specific queries
and is orthogonal to the results in our paper.

Given the above definitions, we can now define the OGIS approach formally.

Definition 2.5 Given a FIS $(\mathcal{C}, E, \Phi, \mathcal{O})$, an oracle-guided inductive synthesis (OGIS) procedure
(engine) is a tuple $(O, L)$, comprising an oracle $O : D^* \times Q \to R$ and a learner $L : D^* \to Q \times \mathcal{C}$, where the
oracle and learner are consistent with the given oracle interface $\mathcal{O}$ as defined above. ■

In other words, an OGIS engine comprises an oracle $O$ that maps a “dialogue history” and a current query
to a response, and a learner $L$ that, given a dialogue history, outputs a hypothesized concept along with a
new query. Upon convergence, the final concept output by $L$ is the output of the OGIS procedure.

We also formalize the definition of when an OGIS engine solves an FIS problem. 

Definition 2.6 A dialogue sequence $\delta \in D^*$ corresponding to OGIS procedure $(O, L)$ is such that $\delta(i)$ is
$(q, r)$ where $L(\delta[i-1]) = (q, c)$ for some query $q \in Q$ and some concept $c \in \mathcal{C}$, and $O(\delta[i-1], q) = r$.

The OGIS procedure $(O, L)$ is said to solve the FIS problem with dialogue sequence $\delta$ if there exists
an $i$ such that $L(\delta[i]) = (q, c)$, $c \in \mathcal{C}$ and $c$ satisfies $\Phi$, and for all $j > i$, $L(\delta[j]) = (q', c)$, that is, the
OGIS procedure converges to a concept $c$ that satisfies $\Phi$.

The OGIS procedure $(O, L)$ is said to solve the FIS problem if there exists a dialogue sequence $\delta$
with which it solves that problem.

■
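
The dialogue between learner and oracle can be sketched as a simple driver loop. The following Python fragment is a minimal, deterministic illustration rather than the general nondeterministic setting of the definitions: the finite domain, the hidden threshold concept, the binary-search learner, and the convention that the loop stops once the learner emits the empty query (playing the role of $\bot$) are all assumptions made for this example.

```python
E = list(range(8))                             # assumption: a small finite domain
TARGET = frozenset(x for x in E if x >= 5)     # known to the oracle only

def oracle(history, query):
    """Membership oracle: labels the queried example; None stands for ⊥."""
    kind, x = query
    if kind is None:
        return None
    return x in TARGET

def learner(history):
    """Binary search for the threshold t of a concept {x | x >= t}."""
    lo, hi = 0, len(E)                         # invariant: lo <= t <= hi
    for (kind, x), is_member in history:
        if kind is None:
            continue
        if is_member:
            hi = min(hi, x)
        else:
            lo = max(lo, x + 1)
    hypothesis = frozenset(x for x in E if x >= hi)
    if lo == hi:                               # threshold pinned down: stop asking
        return (None, None), hypothesis
    return ("mem", (lo + hi) // 2), hypothesis

def run_ogis(learner, oracle, max_steps=20):
    """Alternate L(history) and O(history, q); record each hypothesis."""
    history, hypotheses = [], []
    for _ in range(max_steps):
        query, concept = learner(history)
        hypotheses.append(concept)
        if query[0] is None:
            break
        history.append((query, oracle(history, query)))
    return history, hypotheses

history, hypotheses = run_ogis(learner, oracle)
```

Here the length of `history` counts the oracle calls made before the hypothesis stops changing, which is one simple way to measure the number of iterations of the loop.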

The convergence and computational complexity of an OGIS procedure are determined by the nature
of the FIS problem along with three factors: (i) the complexity of each invocation of the learner $L$; (ii)
the complexity of each invocation of the oracle $O$; and (iii) the number of iterations (queries, examples)
of the loop before convergence. We term the first two factors learner complexity and oracle complexity,
and the third sample complexity. Sometimes, in OGIS procedures, oracle complexity is ignored, so
that we simply count calls to the oracle rather than the time spent in each call.

An OGIS procedure is defined by properties of the learner and the oracle. Relevant properties of
the learner include (i) its inductive bias that restricts its search to a particular family of concepts and a
search strategy over this space, and (ii) resource constraints, such as finite or infinite memory. Relevant
properties of the oracle include the types of queries it supports and of the responses it generates. We list
below the common query and response types. In each case, the query type is given in square brackets
as a template comprising a query name along with the types of the formal arguments to that query, e.g.,
examples $x$ or concepts $c$. An instance of each of these types, that is, a query, is formed by substituting
specific arguments (examples, concepts, etc.) for the formal arguments.

1. Membership query: [$q_{mem}(x)$] The learner selects an example $x$ and asks “Is $x$ positive or negative?”
The oracle responds with a label for $x$, indicating whether $x$ is a positive or negative example.

2. Positive witness query: [$q^+_{wit}$] The learner asks the oracle “Give me a positive example”. The oracle
responds with an example $x \vdash \Phi$, if one exists, and with $\bot$ otherwise.

3. Negative witness query: [$q^-_{wit}$] The learner asks the oracle “Give me a negative example”. The oracle
responds with an example $x \nvdash \Phi$, if one exists, and with $\bot$ otherwise.

4. Counterexample query: [$q_{ce}(c)$] The learner proposes a candidate concept $c$ and asks “Does the oracle
have a counterexample demonstrating that $c$ is incorrect?” (i.e., “proof that $c \notin \Phi$?”). If the oracle can
find a counterexample $x$ witnessing $c \notin \Phi$, the oracle provides the counterexample. Otherwise, if the oracle
cannot find any counterexample, it responds with $\bot$. Such a query allows us to accurately model the
working of counterexample-guided synthesis techniques such as [35] where the verification problem
is undecidable but, if a counterexample is reported, it is a true counterexample.

5. Correctness query: [$q_{corr}(c)$] The learner proposes a candidate concept $c$ and asks “Is $c$ correct?” (i.e.,
“does it satisfy $\Phi$?”). If so, the oracle responds “YES” (and the synthesis can terminate). If not,
the oracle responds “NO” and provides a counterexample $x$. Here $x$ is an example such that either
$x \in c$ but $x \nvdash \Phi$, or $x \notin c$ and there exists some other concept $c' \in \Phi$ containing $x$. This query is
stronger than the counterexample query, as it is guaranteed to provide a counterexample whenever
the proposed $c$ is not correct.

For the special case of trace properties, the correctness query can take on specific forms. One form is
termed the equivalence query, denoted $q_{eq}$, where the counterexample is in the symmetric difference of
the single correct target concept and $c$. The other is termed the subsumption query, denoted $q_{sub}$, where
the counterexample is a negative example present in $c$, and is used when $\Phi$ is a partial specification
admitting several correct concepts. It is important to note that, in the general case, a verification query
does not, by itself, specify any label for a counterexample. One may need an additional membership
query to generate a label for a counterexample.

6. Crafted Correctness (Verification) query: [$q_{ccorr}(c, \Phi')$] As noted earlier, oracles used in formal
inductive synthesis tend to be general-purpose decision procedures. Thus, they can usually answer not
only verification queries with respect to the specification $\Phi$ for the overall FIS problem, but also
verification queries for specifications crafted by the learner. We refer to this class of queries as crafted
correctness/verification queries. The learner asks “Does $c$ satisfy $\Phi'$?” for a crafted specification $\Phi'$
and a crafted concept $c$.

As for $q_{corr}$, one can define as special cases a crafted equivalence query type $q_{ceq}$ and a crafted
subsumption query type $q_{csub}$.

7. Distinguishing input query: [$q_{diff}(X, c)$] In this query, the learner supplies a finite set of examples $X$
and a concept $c$, where $X \subseteq c$, and asks “Does there exist another concept $c'$ s.t. $c \neq c'$ and $X \subseteq c'$?” If
so, the oracle responds “YES” and provides both $c'$ and an example $x \in c \oplus c'$. The example $x$ forms a
so-called “distinguishing input” that differentiates the two concepts $c$ and $c'$. If no such $c'$ exists, the
oracle responds “NO”.

The distinguishing input query has been found useful in scenarios where it is computationally hard to
check correctness using the specification $\Phi$, such as in malware deobfuscation [30].
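
For a finite domain, several of the query types above can be sketched directly by enumeration. The Python fragment below is only an illustration: it assumes a finite list of examples `E`, a specification given explicitly as a set of concepts, and a finite concept class; the function names mirror the query names but are otherwise hypothetical, and the value `None` stands in for $\bot$. Real oracles would use a decision procedure rather than enumeration.

```python
def positive(x, spec):
    """x is positive iff some concept in the specification contains it."""
    return any(x in c for c in spec)

def q_mem(x, spec):
    """Membership query: the label of x."""
    return positive(x, spec)

def q_wit(E, spec, want_positive=True):
    """Witness query: some example with the requested label, or None."""
    return next((x for x in E if positive(x, spec) == want_positive), None)

def q_corr(c, E, spec):
    """Correctness query: ("YES", None) or ("NO", counterexample)."""
    if c in spec:
        return ("YES", None)
    for x in E:
        negative_inside = x in c and not positive(x, spec)
        positive_outside = x not in c and positive(x, spec)
        if negative_inside or positive_outside:
            return ("NO", x)
    return ("NO", None)   # no example of either kind exists in E

def q_diff(X, c, concept_class):
    """Distinguishing input query: another concept containing X, plus a witness."""
    for c2 in concept_class:
        if c2 != c and X <= c2:
            return ("YES", c2, min(c ^ c2))   # c ^ c2 is the symmetric difference
    return ("NO", None, None)

E = [0, 1, 2, 3]
spec = {frozenset({0, 1})}                    # a complete specification
concept_class = [frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2})]
```

For example, `q_corr(frozenset({0, 1, 2}), E, spec)` rejects the candidate with the negative example 2, and `q_diff(frozenset({0}), frozenset({0, 1}), concept_class)` reports that `{0}` also contains the supplied examples and differs from the candidate on input 1.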

The query/response types $q_{mem}$, $q^+_{wit}$, $q^-_{wit}$, $q_{ce}$, $q_{corr}$, $q_{ccorr}$ and $q_{diff}$ listed above are not meant to be
exhaustive. Any subset of such types can form an oracle interface $\mathcal{O}$. We note here that, in the machine
learning theory community, there have been thorough studies of query-based learning; see Angluin’s
review paper for details. However, in our formalization of OGIS, new query types such as $q_{ccorr}$
and $q_{diff}$ are possible due to the previously-identified key differences with traditional machine learning
including the general-purpose nature of oracle implementations and the ability to select or even design
the oracle. Moreover, as we will see, our theoretical analysis raises the following questions that are
pertinent in the setting of formal synthesis where the learner and oracle are typically implemented as
general-purpose decision procedures:

• Oracle design: When multiple valid responses can be made to a query, which ones are better, in terms 
of convergence to a correct concept (convergence and complexity)? 

• Learner design: How do resource constraints on the learner or its choice of search strategy affect 
convergence to a correct concept? 

2.2 Examples of OGIS 

We now take three example synthesis techniques previously presented in literature and illustrate how they 
instantiate the OGIS paradigm. These techniques mainly differ in the oracle interface that they employ. 

Example 2.1 Query-based learning of automata [2]:

Angluin’s classic work on learning deterministic finite automata (DFAs) from membership and
equivalence queries [2] is an instance of OGIS with $\mathcal{O} = \{q_{mem}, q_{eq}\}$. The learner is a custom-designed
algorithm called L*, whereas the oracle is treated as a black box that answers the membership and equivalence
queries; in particular, no assumptions are made about the form of counterexamples. Several variants of
L* have found use in the formal verification literature.

Example 2.2 Counterexample-guided inductive synthesis (CEGIS) [55]:

CEGIS was originally proposed as an algorithmic method for program synthesis where the specification
is given as a reference program and the concept class is defined using a partial program, also referred to as
a “sketch” [55]. It has since proved very versatile, also applying to partial specifications
and other ways of providing syntax guidance; see [1] for a more detailed treatment. In CEGIS, the
learner (synthesizer) interacts with a “verifier” that can take in a candidate program and a specification,
and try to find a counterexample showing that the candidate program does not satisfy the specification.
In CEGIS, the learner is typically implemented on top of a general-purpose decision procedure such
as a SAT solver, SMT solver, or model checker. The oracle (verifier) is also implemented similarly. In
addition to a counterexample-generating oracle, many instances of CEGIS also randomly sample positive
examples (see Sec. 5.4 of [55] and Fig. 3 of [35]). Moreover, the counterexample-generating oracle is
not required to be a sound verifier that can declare correctness (e.g., see [35]). Thus, we model CEGIS
as an instance of OGIS with $\mathcal{O} = \{q^+_{wit}, q_{ce}\}$.

As noted earlier, if the verifier is sound (can prove correctness of a candidate concept), then $q_{ce}$ can be
substituted by $q_{corr}$. Moreover, general-purpose verifiers typically support not only correctness queries
with respect to the original specification, but also crafted correctness queries, as well as membership
queries, which are special cases of the verification problem where the specification is checked on a
single input/output behavior. We term an instantiation of CEGIS with these additional query types
generalized CEGIS, which has an oracle interface $\mathcal{Q} = \{q_{wit}^{+}, q_{corr}, q_{ccorr}, q_{mem}\}$. We will restrict our
attention in this paper to the standard CEGIS.
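
The learner/verifier interaction of Example 2.2 can be sketched as a simple loop. The Python sketch below is illustrative only and is not the implementation of [55]: it assumes a finite list of candidate programs, a finite input domain, and a specification given as a reference program. The names `cegis`, `candidates`, `reference`, and `domain` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

Program = Callable[[int], int]


def cegis(candidates: Sequence[Program], reference: Program,
          domain: Sequence[int]) -> Optional[Program]:
    """Toy CEGIS loop (assumption: finite candidate list and finite domain)."""
    counterexamples: List[int] = []
    while True:
        # Learner: first candidate that agrees with the reference program
        # on every counterexample input collected so far.
        cand = next((c for c in candidates
                     if all(c(x) == reference(x) for x in counterexamples)), None)
        if cand is None:
            return None  # no candidate is consistent with the examples
        # Verifier: search the domain for an input where the candidate is wrong.
        cex = next((x for x in domain if cand(x) != reference(x)), None)
        if cex is None:
            return cand  # the verifier found no counterexample
        counterexamples.append(cex)


# Hypothetical candidate programs, used only for illustration.
def identity(x: int) -> int:
    return x


def successor(x: int) -> int:
    return x + 1


def double(x: int) -> int:
    return 2 * x


result = cegis([identity, successor, double], double, range(5))
```

Each counterexample eliminates the current candidate, so on a finite domain the loop terminates; this guarantee is exactly what is lost for the infinite concept classes studied in the rest of the paper.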

Example 2.3 Oracle-guided program synthesis using distinguishing inputs [30]:

Our third example is an approach to program synthesis that uses distinguishing inputs when a complete
specification is either unavailable or it is expensive to verify a candidate program against its
specification [30]. In this case, distinguishing input queries, combined with witness and membership queries,
provide a way to quickly generate a corpus of examples that rule out incorrect programs. Only when
a single program is consistent with these examples does a correctness query need to be
made to ascertain its correctness. Thus, the oracle interface is $\mathcal{Q} = \{q_{wit}^{+}, q_{diff}, q_{mem}, q_{corr}\}$, with $q_{corr}$ being
used sparingly. The learner and the oracle are implemented using SMT solving.
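
The idea of Example 2.3 can be illustrated with a small sketch. It is a simplification under stated assumptions rather than the SMT-based method of [30]: candidates form a finite list, the domain is finite and searched by brute force, and a hypothetical `io_oracle` answers membership-style queries by returning the desired output for an input. The function names are hypothetical as well.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Program = Callable[[int], int]


def distinguishing_input(consistent: Sequence[Program],
                         domain: Sequence[int]) -> Optional[int]:
    """Return an input on which two consistent candidates disagree, if any."""
    for i, p in enumerate(consistent):
        for q in consistent[i + 1:]:
            for x in domain:
                if p(x) != q(x):
                    return x
    return None


def synthesize(candidates: Sequence[Program], io_oracle: Program,
               domain: Sequence[int]) -> List[Program]:
    """Query the oracle on distinguishing inputs until none remain."""
    examples: List[Tuple[int, int]] = []
    while True:
        consistent = [c for c in candidates
                      if all(c(x) == y for x, y in examples)]
        x = distinguishing_input(consistent, domain)
        if x is None:
            return consistent  # survivors agree on the whole domain
        examples.append((x, io_oracle(x)))  # membership-style query on x


# Hypothetical candidates; the oracle behaves like `square`.
def identity(x: int) -> int:
    return x


def square(x: int) -> int:
    return x * x


def double(x: int) -> int:
    return 2 * x


survivors = synthesize([identity, square, double], square, list(range(5)))
```

Every query removes at least one candidate, so only the final survivor would need to be checked with a (possibly expensive) correctness query.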


2.3 Counterexample-Guided Inductive Synthesis (CEGIS) 

Consider the CEGIS instantiation of the OGIS framework. In this paper, we consider a general setting
where the concept class is the set of programs corresponding to the set of recursive (decidable)
languages; thus, it is infinite. The domain E of examples is also infinite. We choose such an expressive
concept class and domain because we want to compare how the power of CEGIS varies as we vary the
oracle and learner. More specifically, we vary the nature of responses from the oracle to correctness and
witness queries, and the memory available to the learner.

For the oracle, we consider four different types of counterexamples that the oracle can provide in
response to a correctness query. Recall that in formal synthesis, oracles are general-purpose verifiers
or decision procedures whose internal heuristics may determine the type of counterexample obtained.
Each type describes a different oracle and hence, a different flavor of CEGIS. Our goal is to compare
these synthesis techniques and establish whether one type of counterexample allows the synthesizer to
successfully learn more programs than the other. The four kinds of counterexamples considered in this
paper are as follows:

1. Arbitrary counterexamples: This is the “standard” CEGIS technique (denoted CEGIS) that makes no
assumptions on the form of the counterexample obtained from the oracle. Note however that our focus
is on an infinite concept class, whereas most practical instantiations of CEGIS have focused on finite
concept classes; thus, convergence is no longer guaranteed in our setting. This version of CEGIS
serves as the baseline for comparison against other synthesis techniques.

2. Minimal counterexamples: We require that the verification oracle provide a counterexample from E
which is minimal for a given ordering over E. The size of examples can be used for ordering. The exact
definition of “size” is left abstract and can be defined suitably in different contexts. The intuition is
to use counterexamples of smaller size, which eliminate more candidate concepts. Significant effort
has been made on improving validation engines to produce counterexamples which aid debugging by
localizing the error B^lTdll . The use of counterexamples in CEGIS is conceptually an iterative repair
process and hence, it is natural to extend successful error localization and debugging techniques to
inductive synthesis.

3. Constant-bounded counterexamples: Here the “size” of the counterexamples produced by the
verification oracle is bounded by a constant. This is motivated by the use of bounds in formal verification
such as bounded model checking lITOl and bug-finding in concurrent programs Q using bounds on
context switches.

4. Positive-bounded counterexamples: Here the counterexample produced by the validation engine must
be smaller than a previously seen positive example. This is motivated by the industrial practice
of validation by simulation, where the system is often simulated to a finite length to discover bugs.
The length of simulation often depends on the traces which illustrate known positive behaviors. It is
expected that errors will show up if the system is simulated up to the length of the largest positive
trace. Mutation-based software testing and symbolic execution also have a similar flavor, where a
sample correct execution is mutated to find bugs.

In addition to the above variations to the oracle, we also consider two kinds of learners that differ
based on their ability to store examples and counterexamples:

1. Infinite memory: In the typical setting of CEGIS, the learner is not assumed to have any memory
bound, allowing the learner to store as many examples and counterexamples as needed. Note that, for
an infinite domain, this set of examples can grow without bound.

2. Finite memory: A more practical setting is one where the learner only has a finite amount of memory,
and therefore can only store a finite representation of examples or hypothesized programs. This notion
of finite memory is similar to that used classically for language learning from examples [62]. We give
the first theoretical results on the power of CEGIS and its variants, for general program synthesis, in
this restricted setting.


We introduce notation to refer to these variants in a more compact manner. The synthesis engine
using arbitrary counterexamples and with infinite memory is denoted as $T_{\mathrm{CEGIS}}$. The variant of the
synthesis engine which is restricted to use finite memory is referred to as $T_{\mathrm{cegis}}$. Similarly, the synthesis
engine using minimal counterexamples and infinite memory is called minimal counterexample guided
inductive synthesis ($T_{\mathrm{MINCEGIS}}$). The variant of this engine using finite memory is referred to as $T_{\mathrm{mincegis}}$.
The synthesis engine using counterexamples which are smaller than a fixed constant is called constant
bounded counterexample guided inductive synthesis, and is denoted as $T_{\mathrm{CBCEGIS}}$ if the memory is not
finite and $T_{\mathrm{cbcegis}}$ if the memory is finite. The synthesis engine using counterexamples which are smaller
than the largest positive examples is called positive-history bounded counterexample guided inductive
synthesis, and is denoted as $T_{\mathrm{PBCEGIS}}$ if the memory is not finite and $T_{\mathrm{pbcegis}}$ if the memory is finite.

For the class of programs corresponding to the set of recursive languages, our focus is on learning
in the limit, that is, whether the synthesis technique converges to the correct program or not (see
Definition 4.14 in Sec. 4 for a formal definition). This question is non-trivial since our concept class is not
finite. In this paper, we do not discuss the computational complexity of synthesis or the impact of different
types of counterexamples on the speed of convergence. Investigating the computational complexity for
concept classes for which synthesis is guaranteed to terminate is left as a topic for future research.

We also present an initial complexity analysis for OGIS in the case of finite concept classes. The
decidability question for a finite class of programs is trivial since convergence is guaranteed as long as the
queries provide new examples or some new information about the target program. But the speed at
which the synthesis approach converges remains relevant even for a finite class of programs. We show
that the complexity of these techniques is related to well-studied notions in learning theory such as the
Vapnik-Chervonenkis dimension [12] and the teaching dimension [20].


3 Background and Related Work 

In this section, we contrast the contributions of this paper with the most closely related work and also 
provide some relevant background. 

3.1 Formal Synthesis 

The past decade has seen an explosion of work in program synthesis (e.g. ll54ll55ll^l5^lT7ll58l). Moreover,
there has been a realization that many of the trickiest steps in formal verification involve synthesis
of artifacts such as inductive invariants, ranking functions, assumptions, etc. am 121. Most of these
efforts have focused on solution techniques for specific synthesis problems. There are two main unifying
characteristics across most of these efforts: (i) syntactic restrictions on the space of programs/artifacts to
be synthesized in the form of templates, sketches, component libraries, etc., and (ii) the use of inductive
synthesis from examples. The recent work on syntax-guided synthesis (SyGuS) [1] is an attempt to
capture these disparate efforts in a common theoretical formalism. While SyGuS is about formalizing the
synthesis problem, the present paper focuses on formalizing common ideas in the solution techniques.
Specifically, we present OGIS as a unifying formalism for different solution techniques, along with a
theoretical analysis of different variants of CEGIS, the most common instantiation of OGIS. In this sense,
it is complementary to the SyGuS effort.

3.2 Machine Learning Theory 

Another related area is the field of machine learning, particularly the theoretical literature. In Section 1
we outlined some of the key differences between the fields of formal inductive synthesis and that of
machine learning. Here we focus on the sub-field of query-based learning that is the closest to the OGIS
framework. The reader is referred to Angluin’s excellent papers on the topic for more background [4, 5].

A major difference between the query-based learning literature and our work is in the treatment of
oracles, specifically, how much control one has over the oracle that answers queries. In query-based
learning, the oracles are treated as black boxes that answer particular types of queries and only need to
provide one valid response to a query. Moreover, it is typical in the query-based learning literature for
the oracle to be specified a priori as part of the problem formulation. In contrast, in our OGIS
framework, designing a synthesis procedure involves also designing or selecting an oracle. The second major
difference is that the query-based learning literature focuses on specific concept classes and proves
convergence and complexity results for those classes. In contrast, our work proves results that are generally
applicable to programs corresponding to recursive languages.

3.3 Learning of Formal Languages 

The problem of learning a formal language from examples is a classic one. We cover here some relevant 
background material. 

Gold [19] considered the problem of learning formal languages from examples. Similar techniques
have been studied elsewhere in the literature Il29l l63l [TTl |2]|. The examples are provided to the learner as an
infinite stream. The learner is assumed to have unbounded memory and can store all the examples.
This model is unrealistic in a practical setting but provides useful theoretical understanding of inductive
learning of formal languages. Gold defined a class of languages to be identifiable in the limit if there is a
learning procedure which identifies the grammar of the target language from the class of languages using
a stream of input strings. The languages learnt using only positive examples were called text learnable
and the languages which require both positive and negative examples were termed informant learnable.
None of the standard classes of formal languages are identifiable in the limit from text, that is, from only
positive examples [19]. This includes regular languages, context-free languages and context-sensitive
languages.

A detailed survey of classical results in learning from positive examples is presented by Lange et
al. [39]. The results summarize learning power with different limitations such as the inputs having certain
noise, that is, a string not in the target language might be provided as a positive example with a small
probability. Learning using positive as well as negative examples has also been well-studied in the literature.
A detailed survey is presented in ETIl and [38]. Lange and Zilles [40] relate Angluin-style query-based
learning with Gold-style learning. They establish that any query learner using superset queries can be
simulated by a Gold-style learner receiving only positive data. But there are concepts learnable using
subset queries but not Gold-style learnable from positive data only. Learning with equivalence queries
coincides with Gold’s model of limit learning from positive and negative examples, while learning with
membership queries equals finite learning from positive data and negative data. In contrast to this line of
work, we present a general framework OGIS to learn programs or languages, and Angluin-style or Gold-style
approaches can be instantiated in this framework. Our theoretical analysis focuses on varying the
oracle and the nature of the counterexamples produced by it to examine the impact of using different types of
counterexamples obtainable from verification or testing tools.

3.4 Learning vs. Teaching 

We also study the complexity of synthesizing programs from a finite class of programs. This part of
our work is related to previous work on the complexity of teaching in exact learning of concepts by
Goldman and Kearns [20]. Informally, the teaching dimension of a concept class is the minimum number
of instances a teacher must reveal to uniquely identify any target concept from the class. Exact bounds
on teaching dimensions for specific concept classes such as orthogonal rectangles, monotonic decision
trees, monomials, binary relations and total orders have been previously presented in the literature [20, 21].
Shinohara et al. ||5^ also introduced a notion of teachability in which a concept class is teachable by
examples if there exists a polynomial size sample under which all consistent learners will exactly identify
the target concept. Salzberg et al. [50] also consider a model of learning with a helpful teacher. Their
model requires that any teacher using a particular algorithm such as the nearest-neighbor algorithm learns
the target concept. This work assumes that the teacher knows the algorithm used by the learner. We do
not make any assumption on the inductive learning technique used by the OGIS synthesis engine. Our
goal is to obtain bounds on the number of examples that need to be provided by the oracle to synthesize
the correct program by relating our framework to the literature on teaching.
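
To make the informal definition of teaching dimension concrete, the following sketch computes it by brute force for a small finite concept class. It is an illustration under stated assumptions, not an algorithm from the cited literature: concepts are represented as Python frozensets over a finite domain, all concepts are assumed pairwise distinct, and the function names are hypothetical. The search is exponential in the domain size.

```python
from itertools import combinations
from typing import FrozenSet, Optional, Sequence

Concept = FrozenSet[int]


def min_teaching_set_size(target: Concept, concepts: Sequence[Concept],
                          domain: Sequence[int]) -> Optional[int]:
    """Size of a smallest set of labeled instances that rules out every
    other concept in the class (None if no such set exists)."""
    others = [c for c in concepts if c != target]
    for k in range(len(domain) + 1):
        for subset in combinations(domain, k):
            # Each other concept must disagree with the target's labeling
            # on at least one instance of the subset.
            if all(any((x in target) != (x in c) for x in subset)
                   for c in others):
                return k
    return None


def teaching_dimension(concepts: Sequence[Concept],
                       domain: Sequence[int]) -> int:
    """Worst case, over all targets, of the minimum teaching set size
    (assumes pairwise distinct concepts, so every size is defined)."""
    return max(min_teaching_set_size(c, concepts, domain) for c in concepts)


# Illustrative class: the empty concept and the three singletons over {0, 1, 2}.
domain = [0, 1, 2]
concepts = [frozenset(), frozenset({0}), frozenset({1}), frozenset({2})]
```

In this class a singleton is identified by one positive instance, while the empty concept needs all three instances labeled negative, so the worst case dominates.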

4 Theoretical Analysis of CEGIS: Preliminaries 

Our presentation of formal inductive synthesis and OGIS so far has not used a particular
representation of a concept class or specification. In this section, we begin our theoretical formalization of the
counterexample-guided inductive synthesis (CEGIS) technique, for which such a choice is necessary.
We precisely define the formal inductive synthesis problem for concepts that correspond to recursive
languages. We restrict our attention to the case when the specification is partial and is a trace property,
i.e., the specification is defined by a single formal language. This assumption, which is the typical case
in formal verification and synthesis, also simplifies notation and proofs. Most of our results extend to
the case of more general specifications; we will make suitable additional remarks about the general case
where needed. For ease of reference, the major definitions and frequently used notation are summarized
in Table 1.

4.1 Basic Notation 

We use $\mathbb{N}$ to denote the set of natural numbers. $\mathbb{N}_i \subset \mathbb{N}$ denotes the subset of natural numbers
$\mathbb{N}_i = \{n \mid n < i\}$. Consider a set $S \subset \mathbb{N}$. $\min(S)$ denotes the minimal element in $S$. The union of sets
is denoted by $\cup$ and the intersection of sets is denoted by $\cap$. $S_1 \setminus S_2$ denotes the set minus operation,
with the resultant set containing all elements in $S_1$ and not in $S_2$.

We denote the set $\mathbb{N} \cup \{\bot\}$ as $\mathbb{N}_\bot$. A sequence $\sigma$ is a mapping from $\mathbb{N}$ to $\mathbb{N}_\bot$. We denote a prefix
of length $k$ of a sequence by $\sigma[k]$. So, $\sigma[k]$ of length $k$ is a mapping from $\mathbb{N}_k$ to $\mathbb{N}_\bot$. $\sigma[0]$ is an empty
sequence, also denoted by $\sigma_0$ for brevity. The set of natural numbers appearing in the sequence $\sigma[i]$ is
defined using a function SAMPLE, where $\mathrm{SAMPLE}(\sigma[i]) = \mathrm{range}(\sigma[i]) - \{\bot\}$. The set of sequences is
denoted by $\Sigma$.
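
The sequence notation above can be mirrored in a few lines of code. This is only a sketch for finite prefixes, with the assumption that `None` stands in for $\bot$; the helper names `prefix` and `sample` are hypothetical.

```python
from typing import List, Optional, Sequence, Set

BOTTOM = None  # assumption: None plays the role of the symbol "bottom"


def prefix(sigma: Sequence[Optional[int]], k: int) -> List[Optional[int]]:
    """sigma[k]: the prefix of length k of a sequence."""
    return list(sigma[:k])


def sample(sigma_prefix: Sequence[Optional[int]]) -> Set[int]:
    """SAMPLE: natural numbers occurring in a prefix, ignoring bottom."""
    return {x for x in sigma_prefix if x is not BOTTOM}


sigma = [3, BOTTOM, 1, 3, 7]
```

For example, the prefix of length 4 of `sigma` contains the entries 3, bottom, 1, 3, so its sample is the set {1, 3}, and the prefix of length 0 is the empty sequence.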

Languages and Programs: We also use standard definitions from computability theory which relate
languages and programs [49]. A set $L$ of natural numbers is called a computable or recursive language if
there is a program, that is, a computable, total function $P$ such that for any natural number $n$,

$P(n) = 1$ if $n \in L$ and $P(n) = 0$ if $n \notin L$.

We say that $P$ identifies the language $L$. Let $L_{map}(P)$ denote the language $L$ identified by the program
$P$. The mapping $L_{map}$ is not necessarily one-to-one and hence, syntactically different programs might
identify the same language. In formal synthesis, we do not distinguish between syntactically different
programs that satisfy the specification. Additionally, in this paper, we restrict our discussion to
recursive languages because they include many interesting and natural classes of languages that correspond to
programs and functions of various kinds, including regular, context free, context sensitive, and pattern
languages.

Given a sequence of non-empty languages $\mathcal{L} = L_0, L_1, L_2, \ldots$, $\mathcal{L}$ is said to be an indexed family
of languages if and only if for all languages $L_i$, there exists a recursive function TEMPLATE such that
$\mathrm{TEMPLATE}(j, n) = P(n)$ and $L_{map}(P) = L_i$ for some $j$. Practical applications of program synthesis often
consider a family of candidate programs which contain syntactically different programs that are
semantically equivalent, that is, they have the same set of behaviors. Formally, in practice program synthesis
techniques permit picking $j$ such that $\mathrm{TEMPLATE}(j, n) = P(n)$ and $L_{map}(P) = L_i$ for all $j \in I_j$ where the
set $I_j$ represents the syntactically different but semantically equivalent programs that produce output 1 on
an input if and only if the input natural number belongs to $L_i$. Intuitively, a function TEMPLATE defines
an encoding of the space of candidate programs similar to encodings proposed in the literature such as
those on program sketching [55] and component interconnection encoding [30]. In the case of formal
synthesis where we have a specification $\Phi$, we are only interested in finding a single program satisfying
$\Phi$. In the general case, $\Phi$ comprises a set of allowed languages, and the task of synthesis is to find a
program identifying some element of this set. In the case of partial specifications that are trace properties,
$\Phi$ comprises subsets of a single target language $L_c$. Any program $P_c$ identifying some subset of $L_c$ is a
valid solution, and usually positive examples are used to rule out programs identifying “uninteresting”
subsets of $L_c$. Thus, going forward, we will define the task of program synthesis as one of identifying
the corresponding correct language $L_c$.
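
As a toy illustration of an indexed family, the sketch below defines a hypothetical `template` function in which two different indices describe the same language, mirroring the remark that syntactically different programs may be semantically equivalent. The particular family (multiples of a modulus) is an assumption chosen only for illustration and does not appear in the paper.

```python
from typing import Set


def template(j: int, n: int) -> int:
    """Hypothetical TEMPLATE(j, n): indices 2k and 2k+1 both denote the
    language of multiples of k + 1, so the index map is not one-to-one."""
    modulus = j // 2 + 1
    return 1 if n % modulus == 0 else 0


def language_prefix(j: int, bound: int) -> Set[int]:
    """Members of the j-th language that are below `bound`
    (the language itself is infinite)."""
    return {n for n in range(bound) if template(j, n) == 1}
```

Here indices 2 and 3 both identify the even numbers, while index 4 identifies the multiples of 3; every language in the family is non-empty because it contains 0.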

Ordering of elements in the languages: A language corresponds to a set of program behaviors. We
model this set in an abstract manner, only assuming the presence of a total order over this set, without
prescribing any specific ordering relation. Thus, languages are modeled as sets of natural numbers. While
such an assumption might seem restrictive, we argue that this is not the case in the setting of CEGIS,
where the ordering relation is used specifically to model the oracle’s preference for returning specific
kinds of counterexamples. For example, consider the case where elements of a language are input/output
traces. We can construct a totally ordered set of all possible input/output traces using the length of the
trace as the primary ordering metric and the lexicographic ordering as the secondary ordering metric.
Thus, an oracle producing the smallest counterexample would produce a unique trace which is shortest
in length and is lexicographically the smallest. The exact choice of ordering is orthogonal to the results
presented in our paper, and using the natural numbers allows us to greatly simplify notation.
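
The length-then-lexicographic ordering on traces described above can be written as a sort key. In this sketch, traces are assumed to be tuples of integers; the names `trace_key` and `minimal_counterexample` are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Trace = Tuple[int, ...]


def trace_key(trace: Trace) -> Tuple[int, Trace]:
    """Order traces by length first, then lexicographically."""
    return (len(trace), trace)


def minimal_counterexample(traces: Iterable[Trace]) -> Optional[Trace]:
    """Unique smallest trace under trace_key, or None if there is none."""
    traces = list(traces)
    return min(traces, key=trace_key) if traces else None


bad_traces = [(1, 0, 1), (2, 7), (1, 5), (0, 9)]
```

Sorting all traces with `trace_key` and numbering them in order gives one way of mapping traces to natural numbers, which is the abstraction used in the rest of this section.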

4.2 CEGIS Definitions 

We now specialize the definitions from Sec. 2 for the case of CEGIS. An indexed family of languages
(also called a language class) $\mathcal{L}$ defines the concept class $\mathcal{C}$ for synthesis. The domain E for synthesis
is the set of natural numbers $\mathbb{N}$ and the examples are $i \in \mathbb{N}$. Recall that we restrict our attention to
the special case where the specification $\Phi$ is captured by a single target language, i.e., $L_c$ comprising
all permitted program behaviors. Therefore, the formal inductive synthesis (FIS) problem defined in
Section 2 (Definition 2.3) can be restricted for this setting as follows:

Definition 4.1 Given a language class $\mathcal{L}$, a domain of examples $\mathbb{N}$, the specification $\Phi$ defined by a
target language $L_c$, and an oracle interface $\mathcal{O}$, the problem of formal inductive synthesis of languages
(and the associated programs) is to identify a language in $\Phi$ using only the oracle interface $\mathcal{O}$.

Counterexample-guided inductive synthesis (CEGIS) is a solution to the problem of formal inductive
synthesis of languages where the oracle interface $\mathcal{O}$ is defined as follows.

Definition 4.2 A counterexample-guided inductive synthesis (CEGIS) oracle interface is $\mathcal{O} = \mathcal{Q} \times \mathcal{R}$
where $\mathcal{Q} = \{q_{wit}^{+}, q_{ce}(L)\}$ with $L \in \mathcal{L}$, $\mathcal{R} = \mathbb{N}_\bot$, and the specification $\Phi$ is defined as subsets of a target
language $L_c$. The positive witness query $q_{wit}^{+}$ returns a positive example $i \in L_c$, and the counterexample
query $q_{ce}$ takes as argument a candidate language $L$ and either returns a counterexample $i \in L \setminus L_c$
showing that the candidate language $L$ is incorrect or returns $\bot$ if it cannot find a counterexample.


Symbol | Meaning | Symbol | Meaning
$\mathbb{N}$ | natural numbers | $\mathbb{N}_i$ | natural numbers less than $i$
$\min(S)$ | minimal element in set $S$ | $S_1 \setminus S_2$ | set minus
$S_1 \cap S_2$ | set intersection | $S_1 \cup S_2$ | set union
$\sigma$ | sequence of numbers | $\sigma_0$ | empty sequence
$\sigma[i]$ | sequence of length $i$ | $\sigma(i)$ | $i$-th element of sequence $\sigma$
$L_i$ | language (a subset of $\mathbb{N}$) | $\overline{L_i}$ | complement of language
$P_i$ | program for $L_i$ | $L_{map}(P_i) = L_i$ | language corresponding to $P_i$
$\mathrm{SAMPLE}(\sigma)$ | natural numbers in $\sigma$ | $\Sigma$ | set of sequences
$\mathcal{L}$ | family of languages | $\mathcal{P}$ | family of programs
$\tau$ | transcript | cex | counterexample transcript
$T$ | synthesis engine | learn | inductive learning engine
$\mathrm{CHECK}_L$ | verification oracle for $L$ | $\mathrm{MINCHECK}_L$ | minimal counterexample oracle
$\mathrm{CBCHECK}_{B,L}$ | bounded counterexample oracle | $\mathrm{PBCHECK}_L$ | positive bounded counterexample oracle
CEGIS | set of language families identified by infinite memory CEGIS engine | cegis | set of language families identified by finite memory cegis engine
MINCEGIS | CEGIS with MINCHECK | mincegis | cegis with MINCHECK
CBCEGIS | CEGIS with CBCHECK for a given constant $B$ | cbcegis | cegis with CBCHECK for a given constant $B$
PBCEGIS | CEGIS with PBCHECK | pbcegis | cegis with PBCHECK

Table 1: Frequently used notation in the paper


The sequence $\tau$ of responses to the positive witness query $q_{wit}^{+}$ is called the transcript, and the
sequence cex of the responses to the counterexample queries $q_{ce}$ is called the counterexample sequence.
The positive witness queries can be answered by the oracle sampling examples from the target language.
Our work uses the standard model for language learning in the limit [19], where the learner has access
to an infinite stream of positive examples from the target language. This is also realistic in practical
CEGIS settings for infinite concept classes (e.g., [35]) where more behaviors can be sampled over time.
We formalize these terms below.

Definition 4.3 A transcript $\tau$ for a specification language $L_c$ is a sequence with $\mathrm{SAMPLE}(\tau) = L_c$. $\tau[i]$
denotes the prefix of the transcript $\tau$ of length $i$. $\tau(i)$ denotes the $i$-th element of the transcript.

^1 CEGIS techniques in the literature [55, 35] initiate the search for a correct program using positive examples and use the specification to
obtain positive examples corresponding to counterexamples.

Definition 4.4 A counterexample sequence cex for a specification language $L_c$ from a counterexample
query $q_{ce}$ is a sequence with $\mathrm{cex}(i) = q_{ce}(L_{cand_i})$, where $\mathrm{cex}[i]$ denotes the prefix of the counterexample
sequence cex of length $i$, $\mathrm{cex}(i)$ denotes the $i$-th element of the counterexample sequence, and $L_{cand_i}$ is
the argument of the $i$-th invocation of the query $q_{ce}$.

We now define the verification oracle in CEGIS that produces arbitrary counterexamples, as well as 
its three other variants which generate particular kinds of counterexamples. 

Definition 4.5 A verifier $\mathrm{CHECK}_L$ for language $L$ is a nondeterministic mapping from $\mathcal{L}$ to $\mathbb{N}_\bot$ such that
$\mathrm{CHECK}_L(L_i) = \bot$ if and only if $L_i \subseteq L$, and $\mathrm{CHECK}_L(L_i) \in L_i \setminus L$ otherwise.

Remark: For more general specifications $\Phi$ that are a set of languages, the definition of $\mathrm{CHECK}_L$ changes
in a natural way: it returns $\bot$ if and only if $L_i \in \Phi$ and otherwise returns an example $j$ that is in the
intersection of the symmetric differences of each language $L \in \Phi$ and the candidate language $L_i$.

We define a minimal counterexample generating verifier below. The counterexamples are minimal 
with respect to the total ordering on the domain of examples. 

Definition 4.6 A verifier $\mathrm{MINCHECK}_L$ for a language $L$ is a nondeterministic mapping from $\mathcal{L}$ to $\mathbb{N}_\bot$
such that $\mathrm{MINCHECK}_L(L_i) = \bot$ if and only if $L_i \subseteq L$, and $\mathrm{MINCHECK}_L(L_i) = \min(L_i \setminus L)$ otherwise.

Next, we consider another variant of counterexamples, namely (constant) bounded counterexamples.
Bounded model-checking ITOl returns a counterexample trace for an incorrect design if it can find a
counterexample of length less than the specified constant bound. It fails to find a counterexample for
an incorrect design if no counterexample exists with length less than the given bound. Verification of
concurrent programs by bounding the number of context switches f7l is another example of the bounded
verification technique. This motivates the definition of a verifier which returns counterexamples bounded
by a constant $B$.

Definition 4.7 A verifier $\mathrm{CBCHECK}_{B,L}$ is a nondeterministic mapping from $\mathcal{L}$ to $\mathbb{N}_\bot$ such that
$\mathrm{CBCHECK}_{B,L}(L_i) = m$ where $m \in L_i \setminus L \wedge m < B$ for the given bound $B$, and $\mathrm{CBCHECK}_{B,L}(L_i) = \bot$ if such $m$ does not exist.

The last variant of counterexamples is positive bounded counterexamples. The verifier for generating
positive bounded counterexamples is also provided with the transcript seen so far by the synthesis engine.
The verifier generates a counterexample smaller than the largest positive example in the transcript. If
there is no counterexample smaller than the largest positive example in the transcript, then the verifier
does not return any counterexample. This is motivated by the practice of mutating correct traces to find
bugs in programs and designs. The counterexamples in these techniques are bounded by the size of
positive examples (traces) seen so far.^2

Definition 4.8 A verifier $\mathrm{PBCHECK}_L$ is a nondeterministic mapping from $\mathcal{L} \times \Sigma$ to $\mathbb{N}_\bot$ such that
$\mathrm{PBCHECK}_L(L_i, \tau[n]) = m$ where $m \in L_i \setminus L \wedge m < \tau(j)$ for some $j < n$, and $\mathrm{PBCHECK}_L(L_i, \tau[n]) = \bot$ if such $m$ does not exist.
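
The four verifiers of Definitions 4.5 to 4.8 can be sketched on finite sets. This is a simplification under stated assumptions: languages are represented as finite Python sets (the paper's languages may be infinite), `None` stands in for $\bot$, and nondeterminism is modeled by a random choice. The function names are hypothetical.

```python
import random
from typing import Optional, Sequence, Set

BOTTOM = None  # assumption: None plays the role of "bottom"


def check(target: Set[int], cand: Set[int]) -> Optional[int]:
    """CHECK: any element of cand \\ target, or bottom if cand is a subset."""
    diff = sorted(cand - target)
    return random.choice(diff) if diff else BOTTOM


def mincheck(target: Set[int], cand: Set[int]) -> Optional[int]:
    """MINCHECK: the minimal element of cand \\ target, or bottom."""
    diff = cand - target
    return min(diff) if diff else BOTTOM


def cbcheck(target: Set[int], cand: Set[int], bound: int) -> Optional[int]:
    """CBCHECK: a counterexample smaller than the constant bound, or bottom."""
    diff = sorted(m for m in cand - target if m < bound)
    return random.choice(diff) if diff else BOTTOM


def pbcheck(target: Set[int], cand: Set[int],
            transcript_prefix: Sequence[Optional[int]]) -> Optional[int]:
    """PBCHECK: a counterexample smaller than some positive example seen
    so far, i.e. smaller than the largest one, or bottom."""
    positives = [t for t in transcript_prefix if t is not BOTTOM]
    if not positives:
        return BOTTOM
    limit = max(positives)
    diff = sorted(m for m in cand - target if m < limit)
    return random.choice(diff) if diff else BOTTOM


target = {0, 2, 4}
cand = {0, 1, 2, 3, 9}  # cand \ target = {1, 3, 9}
```

Note that the bounded verifiers may return bottom even when the candidate is incorrect, which is why they are not sound verifiers in the sense of Definition 4.5.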

We now define the oracle for counterexample guided inductive synthesis. We drop the queries in the
dialogue since there are only two kinds of queries and instead only use the sequence of responses: the
transcript $\tau$ and the counterexample sequence cex. The oracle also receives as input the current candidate
language $L_{cand}$ to be used as the argument of the $q_{ce}$ query. The overall response of the oracle is a pair
of elements in $\mathbb{N}_\bot$.

^2 Note that we can extend this definition to include counterexamples of size bounded by that of the largest positive example
seen so far plus a constant. The proof arguments given in Sec. 5 continue to work with only minor modifications.

Definition 4.9 An oracle $O$ for counterexample-guided inductive synthesis (CEGIS oracle) is a
nondeterministic mapping $\Sigma \times \Sigma \times \mathcal{L} \to \mathbb{N}_\bot \times \mathbb{N}_\bot$ such that $O(\tau[i-1], \mathrm{cex}[i-1], L_{cand}) = (\tau(i), \mathrm{cex}(i))$, where
$\tau(i)$ is the nondeterministic response to the positive witness query $q_{wit}^{+}$ and $\mathrm{cex}(i)$ is the nondeterministic
response to the counterexample query $q_{ce}(L_{cand})$. The oracle can use any of the four verifiers presented earlier
to generate the counterexamples. An oracle using $\mathrm{CHECK}_L$ is called $O_{\mathrm{cegis}}$, one using $\mathrm{MINCHECK}_L$ is
called $O_{\mathrm{mincegis}}$, one using $\mathrm{PBCHECK}_L$ is called $O_{\mathrm{pbcegis}}$, and one using $\mathrm{CBCHECK}_{B,L}$ is called $O_{\mathrm{cbcegis}}$.

We make the following two reasonable assumptions on the oracle. First, the oracle is assumed to be consistent:
it does not provide the same example both as a positive example (via a positive witness query) and as
a negative example (as a counterexample). Second, the oracle is assumed to be non-redundant: it does
not repeat any positive examples that it may have previously provided to the learner; for a finite target
language, once the oracle exhausts all positive examples, it will return $\bot$.

The learner is simplified to be a mapping from the sequence of responses to a candidate program. 

Definition 4.10 An infinite memory learner LEARN is a function $\Sigma \times \Sigma \to \mathcal{L}$ such that $\mathrm{LEARN}(\tau[n], \mathrm{cex}[n]) = L$
where $L$ includes all positive examples in $\tau[n]$ and excludes all examples in $\mathrm{cex}[n]$.^3 $\mathrm{LEARN}(\sigma_0, \sigma_0)$ is
a predefined constant representing an initial guess $L_0$ of the language, which, for example, could be $\mathbb{N}$.

We now define a finite memory learner which cannot take the unbounded sequence of responses as 
argument. The finite memory learner instead uses the previous candidate program to summarize the 
response sequence. We assume that languages are encoded in terms of a finite representation (index of 
the language since the language class is an indexed family of languages and we assume that every index 
needs unit memory) such as a program that identifies that language. Such an iterative learner only needs 
finite memory. 

Definition 4.11 A finite memory learner learn is a recursive function $\mathcal{L} \times \mathbb{N}_\bot \times \mathbb{N}_\bot \to \mathcal{L}$ such that
for all $n \ge 0$, $\mathrm{learn}(L_n, \tau(n), \mathrm{cex}(n)) = L_{n+1}$, where $L_{n+1}$ includes all positive examples in $\tau[n]$ and
excludes all examples in $\mathrm{cex}[n]$. We define $L_0 = \mathrm{LEARN}(\sigma_0, \sigma_0)$ to be the initial guess of the language,
which, for example, could be $\mathbb{N}$. For ease of presentation, we omit the finite memory available to the
learner in its functional representation above. The learner can store additional finite information.


The synthesis engine using infinite memory can now be defined as follows. 

Definition 4.12 An infinite memory CEGIS engine $T_{\mathrm{CEGIS}}$ is a pair $(O_{\mathrm{cegis}}, \mathrm{LEARN})$ comprising a CEGIS
oracle $O_{\mathrm{cegis}}$ and an infinite memory learner LEARN, where there exist $\tau$ and cex such that for all $i \ge 0$,
$O_{\mathrm{cegis}}(\tau[i], \mathrm{cex}[i], L_i) = (\tau(i+1), \mathrm{cex}(i+1))$ and $L_i = \mathrm{LEARN}(\tau[i], \mathrm{cex}[i])$. Since the oracle $O_{\mathrm{cegis}}$ is
nondeterministic, $T_{\mathrm{CEGIS}}$ can have multiple transcripts $\tau$ and counterexample sequences cex.

A synthesis engine with finite memory cannot store unbounded infinite transcripts. So, the bounded
memory cegis synthesis engine $T_{\mathrm{cegis}}$ uses a finite memory learner learn.

Definition 4.13 A finite memory cegis engine $T_{\mathrm{cegis}}$ is a tuple $(O_{\mathrm{cegis}}, \mathrm{learn})$ comprising a CEGIS
oracle $O_{\mathrm{cegis}}$ and a finite memory learner learn, where there exist $\tau$ and cex such that for all $i \ge 0$,
$O_{\mathrm{cegis}}(\tau[i], \mathrm{cex}[i], L_{i+1}) = (\tau(i+1), \mathrm{cex}(i+1))$ and $L_{i+1} = \mathrm{learn}(L_i, \tau(i), \mathrm{cex}(i))$. Since the oracle
$O_{\mathrm{cegis}}$ is nondeterministic, $T_{\mathrm{cegis}}$ can have multiple transcripts $\tau$ and counterexample sequences cex.
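
A finite memory engine can be illustrated on a deliberately simple language family. The sketch below assumes the hypothetical family $L_j = \{0, \ldots, j\}$ of initial segments (not a family from the paper), so that the learner's entire state is a single index. The oracle is simulated inline, `None` stands in for $\bot$, and the `pick` parameter models the oracle's choice of counterexample.

```python
from typing import Callable, Iterable, Optional, Sequence


def learn(index: int, positive: Optional[int], cex: Optional[int]) -> int:
    """Finite-memory update for the assumed family L_j = {0, ..., j}:
    only the current index and the latest responses are used."""
    if positive is not None:
        index = max(index, positive)  # include the new positive example
    if cex is not None:
        index = min(index, cex - 1)   # exclude the new counterexample
    return index


def run_engine(target: int, transcript: Iterable[Optional[int]], initial: int,
               pick: Callable[[Sequence[int]], int] = min) -> int:
    """Alternate simulated oracle responses and learner updates."""
    index = initial
    for positive in transcript:
        wrong = range(target + 1, index + 1)  # current candidate minus target
        cex = pick(wrong) if len(wrong) > 0 else None
        index = learn(index, positive, cex)
    return index
```

With target index 5 and a transcript listing all of 0 to 5, the engine settles on index 5 whether the oracle returns the smallest (`pick=min`) or the largest (`pick=max`) counterexample. For this family a single index suffices because positive examples only push the index up and counterexamples only push it down; richer families need not admit such a summary, which is what makes the finite memory setting restrictive.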


A pair (t, cex) is a valid transcript and counterexample sequence for Tc egis if the above definitions 
hold for that pair. We denote this by (t, cex) |= Tcegis- Similar to Definition 2.5 the convergence of the 
counterexample-guided synthesis engine is defined as follows: 


Footnote: This holds due to the specialization of $\Phi$ to a partial specification that is a trace property.
For general $\Phi$, the learner need not exclude all counterexamples.

Definition 4.14 We say that $T_{cegis} = (O_{cegis}, \mathrm{learn})$ identifies $L$, that is, it converges to $L$,
written $T_{cegis} \to L$, if and only if there exists $k$ such that for all $n \geq k$,
$\mathrm{learn}(L_n, \tau[n], cex[n]) = L$ for all valid transcripts $\tau$ and counterexample sequences $cex$
of $T_{cegis}$.


This notion of convergence is standard in the literature on language learning in the limit [19]. For the case
of general specifications $\Phi$, as given in Definition 2.5, the synthesizer must converge to some language in
$\Phi$. As per Definition 4.3, a transcript is an infinite sequence of examples which contains all the elements in
the target language. Definition 4.14 requires the synthesis engine to converge to the correct language after
consuming a finite part of the transcript and counterexample sequence.


We extend Definition 4.14 to general specifications $\Phi$ as follows: $T_{cegis}$ identifies a specification
$\Phi$ if it identifies some language in $\Phi$. As noted before, this section focuses on the case of a partial
specification that is a trace property. In this case, $\Phi$ comprises all subsets of a target language $L_c$. Since
Definition 4.3 defines a transcript as comprising all positive examples in $L_c$ and Definition 4.14 requires
convergence for all possible transcripts, the two notions of identifying $\Phi$ and identifying $L_c$ coincide.
We therefore focus in the next section purely on language identification, with the observation that our results
carry over to the case of "specification identification".


Definition 4.15 $T_{cegis} = (O_{cegis}, \mathrm{learn})$ identifies a language family $\mathcal{L}$ if and only if
$T_{cegis}$ identifies every language $L \in \mathcal{L}$.


The above definition extends to families of specifications in an exactly analogous manner. We now
formally define cegis, the set of language families that can be identified by the inductive synthesis
engines.

Definition 4.16 cegis $= \{\mathcal{L} \mid \exists\, \mathrm{learn} \,.\, \forall O_{cegis} \,.\, \text{the engine } T_{cegis} = (O_{cegis}, \mathrm{learn}) \text{ identifies } \mathcal{L}\}$.


For the other synthesis engines, convergence to the correct language, the identification condition for a
language, and the language families identified are defined similarly, as listed in Table 2.


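| Learner / Oracle      | $O_{cegis}$          | $O_{mincegis}$             | $O_{pbcegis}$             |
|-----------------------|----------------------|----------------------------|---------------------------|
| Finite memory learn   | $T_{cegis}$, cegis   | $T_{mincegis}$, mincegis   | $T_{pbcegis}$, pbcegis    |
| Infinite memory LEARN | $T_{CEGIS}$, CEGIS   | $T_{MINCEGIS}$, MINCEGIS   | $T_{PBCEGIS}$, PBCEGIS    |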

Table 2: Synthesis engines and corresponding sets of language families 

The constant bounded counterexample-guided inductive synthesis oracle $O_{cbcegis}$ uses the verifier
CBCHECK. It takes an additional parameter $B$ which is the constant bound on the maximum size of a
counterexample. If the verifier cannot find a counterexample below this bound, it will respond with $\bot$.

Definition 4.17 Given a bound $B$ and $T_{cbcegis} = (O_{cbcegis}, \mathrm{learn})$ where $O_{cbcegis}$ uses CBCHECK,
we say that $T_{cbcegis}$ identifies a language family $\mathcal{L}$ if and only if $T_{cbcegis}$ identifies every
language $L \in \mathcal{L}$.

Note that the values of $B$ for which a language family $\mathcal{L}$ is identifiable can be different for different
$\mathcal{L}$. The overall class of language families identifiable using $O_{cbcegis}$ oracles can thus be defined as
follows:

Definition 4.18 cbcegis $= \{\mathcal{L} \mid \exists B \,.\, \exists\, \mathrm{learn} \,.\, \forall O_{cbcegis} \text{ s.t. } O_{cbcegis} \text{ uses CBCHECK} \,.\, \text{the engine } T_{cbcegis} = (O_{cbcegis}, \mathrm{learn}) \text{ identifies } \mathcal{L}\}$

Footnote: In this framework, a synthesis engine is only required to converge to the correct concept without
requiring it to recognize that it has converged and terminate. For a finite concept or language, termination can
be trivially guaranteed when the oracle is assumed to be non-redundant and does not repeat examples.

5 Theoretical Analysis of CEGIS: Results

In this section, we present the theoretical results when the class of languages (programs) is infinite. We
consider two axes of variation. We first consider the case in which the inductive learning technique has
finite memory in Section 5.1 and then the case in which it has infinite memory in Section 5.2. For both
cases, we consider the four kinds of counterexamples introduced earlier, namely,
arbitrary counterexamples, minimal counterexamples, constant bounded counterexamples and positive
bounded counterexamples.

For simplicity, our proofs focus on the case of partial specifications that are trace properties, the
common case in formal verification and synthesis. Thus, $\Phi$ comprises subsets of a target specification
language $L_c$. However, many of the results given here extend to the case of general specifications. Most
of our theorems show differences between language classes for CEGIS variants (i.e., theorems showing
that there is a specification on which one variant of CEGIS converges while the other does not), and
for these, it suffices to show such a difference for the more restricted class of partial specifications. The
results also extend to the case of equality between language classes (e.g., Theorem 5.1) in certain cases;
we make suitable remarks alongside.

5.1 Finite Memory Inductive Synthesis 

We investigate the four language classes cegis, mincegis, cbcegis and pbcegis identified by the
synthesis engines $T_{cegis}$, $T_{mincegis}$, $T_{cbcegis}$ and $T_{pbcegis}$ and establish relations between them.
We show that cbcegis $\subset$ mincegis $=$ cegis, pbcegis $\not\subseteq$ cegis and pbcegis $\not\supseteq$ cegis.

5.1.1 Minimal vs. Arbitrary Counterexamples 

We begin by showing that replacing a deductive verification engine which returns arbitrary counterexamples
with a deductive verification engine which returns minimal counterexamples does not change the
power of counterexample-guided inductive synthesis. The result is summarized in Theorem 5.1.

Theorem 5.1 The power of synthesis techniques using arbitrary counterexamples and those using minimal
counterexamples are equivalent, that is, mincegis $=$ cegis.

Proof $\mathrm{MINCHECK}_L$ is a special case of $\mathrm{CHECK}_L$ in that a minimal counterexample reported by
$\mathrm{MINCHECK}_L$ can be treated as an arbitrary counterexample to simulate $T_{cegis}$ using $T_{mincegis}$.
Thus, cegis $\subseteq$ mincegis.

The more interesting case to prove is mincegis $\subseteq$ cegis. For a language $L$, let $T_{mincegis}$ converge
to the correct language $L$ on transcript $\tau$. We show that $T_{cegis}$ can simulate $T_{mincegis}$ and also converge
to $L$ on transcript $\tau$. The proof idea is to show that a finite learner can simulate $\mathrm{MINCHECK}_L$ by making
a finite number of calls to $\mathrm{CHECK}_L$. Therefore, the learner sees the same counterexample sequence with
$\mathrm{CHECK}_L$ as with $\mathrm{MINCHECK}_L$ and thus converges to the same language in both cases.

Consider an arbitrary step of the dialogue between learner and verifier when a counterexample is
returned. Let the arbitrary counterexample returned by the verifier for a candidate language $L_i$ be $c$, that
is, $\mathrm{CHECK}_L(L_i) = c$. Thus, $c$ is an upper bound on the minimal counterexample returned by $\mathrm{MINCHECK}_L$.
The latter can be recovered using the following characterization:

$$\mathrm{MINCHECK}_L(L_i) = \min \{\, j \mid 0 \leq j \leq \mathrm{CHECK}_L(L_i) \text{ and } \mathrm{CHECK}_L(\{j\}) \neq \bot \,\}$$

The learner can thus perform at most $c$ queries to $\mathrm{CHECK}_L$ to compute the minimal counterexample that
would be returned by $\mathrm{MINCHECK}_L$. In case of a totally ordered set (such as $\mathbb{N}$), we could do this more
efficiently using binary search. At each stage of the iteration, the learner needs to store the smallest
counterexample returned so far. Thus, the work performed by the learner in each iteration to craft queries
to $\mathrm{CHECK}_L$ can be done with finite memory. $\mathrm{MINCHECK}_L(L_i)$ can be computed using finite memory and
using at most $c = \mathrm{CHECK}_L(L_i)$ calls of $\mathrm{CHECK}_L$.

Thus, $T_{cegis}$ can simulate $T_{mincegis}$ by finding the minimal counterexample at each step using the
verifier CHECK iteratively as described above. This implies that mincegis $\subseteq$ cegis and, together with
the first inclusion, that mincegis $=$ cegis. ■
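The simulation used in this proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: languages are modeled as finite sets of naturals, `make_check` is a hypothetical stand-in for an arbitrary-counterexample verifier of a trace-property specification, and the singleton queries are restricted to members of the candidate so that the result is a counterexample for that candidate.

```python
from typing import Callable, FrozenSet, Optional

Check = Callable[[FrozenSet[int]], Optional[int]]

def make_check(target: FrozenSet[int]) -> Check:
    """Hypothetical arbitrary-counterexample verifier: returns some element of
    the candidate that lies outside the target, or None (for ⊥)."""
    def check(candidate: FrozenSet[int]) -> Optional[int]:
        bad = candidate - target
        return max(bad) if bad else None   # deliberately not the minimum
    return check

def min_check(check: Check, candidate: FrozenSet[int]) -> Optional[int]:
    """Recover the minimal counterexample using only calls to `check`."""
    c = check(candidate)
    if c is None:
        return None
    # c bounds the minimal counterexample, so a linear scan up to c suffices.
    for j in sorted(x for x in candidate if x <= c):
        if check(frozenset({j})) is not None:   # singleton query {j}
            return j
    return c
```

The scan keeps only the current index `j` and the bound `c`, which mirrors the observation that the simulation needs only finite memory.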

Thus, mincegis successfully converges to the correct language if and only if cegis also successfully 
converges to the correct language. So, there is no increase or decrease in power of synthesis by using the 
deductive verifier that provides minimal counterexamples. 

Remark: The above result (and its analog in Sec. 5.2) also holds in the case of general specifications
when CEGIS is replaced by Generalized CEGIS. In particular, if either crafted correctness ($q_{ccorr}$) or
membership queries ($q_{mem}$) are introduced, then it is easy to show that cegis can simulate mincegis
by mimicking each step of mincegis, recovering the same counterexample it used with suitable $q_{mem}$
or $q_{ccorr}$ queries. In this case, cegis can converge to every language that mincegis converges to, and
hence identifies the same class of specifications.


5.1.2 Bounded vs. Arbitrary Counterexamples 


We next investigate cbcegis and compare its relative synthesis power to cegis. As intuitively
expected, cbcegis is strictly less powerful than cegis, as formalized in Theorem 5.2.


Theorem 5.2 The power of synthesis techniques using bounded counterexamples is less than those using
arbitrary counterexamples, that is, cbcegis $\subset$ cegis.

Proof Since a bounded counterexample is also a counterexample, we can easily simulate a bounded
verifier CBCHECK using CHECK by ignoring counterexamples from CHECK if they are larger than a specified
bound $B$, which is a fixed parameter and can be stored in the finite memory of the inductive learner. Thus,
cbcegis $\subseteq$ cegis.

We now describe a language class for which the corresponding languages cannot be identified using
bounded counterexamples.

Language Family 1 $\mathcal{L}_{notcb} = \{L_i \mid i > B \text{ and } L_i = \{n \mid n \in \mathbb{N} \wedge n > i\}\}$ where $B$ is a constant bound.

We prove this by contradiction. Let us assume that there is a $T_{cbcegis}$ that can identify languages in
$\mathcal{L}_{notcb}$. Let the verifier used by $T_{cbcegis}$ be CBCHECK and $B'$ be the constant bound on the counterexamples
produced by CBCHECK. Consider the languages
$\mathcal{L}_{notcbfail} = \{L_j \mid L_j \in \mathcal{L}_{notcb} \wedge j > B'\} \subseteq \mathcal{L}_{notcb}$.
The set of counterexamples that can be produced by CBCHECK is the same for these languages (that is,
$\{n \mid n \in \mathbb{N} \wedge n < B'\}$) since the counterexamples produced by CBCHECK cannot be larger than $B'$. Thus,
a synthesis engine $T_{cbcegis}$ cannot distinguish between languages in $\mathcal{L}_{notcbfail}$, which is a contradiction.
Thus, $T_{cbcegis}$ cannot identify all languages in $\mathcal{L}_{notcb}$. $T_{cegis}$ can identify all languages in
$\mathcal{L}_{notcb}$ using a simple learner which proposes $L_i$ as the hypothesis language if $i$ is the smallest
positive example seen so far. So, cbcegis $\subset$ cegis. ■
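The indistinguishability step of this proof can be checked on small instances. The sketch below is an illustration only: `cb_check` is a hypothetical bounded verifier for a target of the form $\{n \mid n > i\}$, candidates are modeled as membership predicates, and treating the bound as inclusive is an assumption.

```python
from typing import Callable, Optional

def cb_check(candidate: Callable[[int], bool], i: int, bound: int) -> Optional[int]:
    """Toy bounded verifier for the target {n | n > i}: returns the smallest
    counterexample not exceeding `bound`, or None (for ⊥)."""
    for n in range(bound + 1):
        if candidate(n) and not n > i:   # n is in the candidate but not in the target
            return n
    return None
```

Whenever the target index exceeds the bound, every `n` inspected by the loop lies outside the target, so the response depends only on the candidate. For example, `cb_check(c, 10, 5)` and `cb_check(c, 20, 5)` agree for any predicate `c`.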

We next analyze pbcegis, and show that it is neither equivalent to cegis nor contained in it. So, replacing
a deductive verification engine which returns arbitrary counterexamples with a verification engine
which returns counterexamples bounded by the history of positive examples has an impact on the power of
the synthesis technique. But this does not strictly increase the power of synthesis. Instead, the use of
positive history bounded counterexamples allows languages from new classes to be identified but, at the
same time, languages from some language classes which could be identified by cegis can no longer be
identified using positive bounded counterexamples. The main result regarding the power of synthesis
techniques using positive bounded counterexamples is summarized in Theorem 5.3.

Theorem 5.3 The power of synthesis techniques using arbitrary counterexamples and those using positive
bounded counterexamples are not equivalent, and none is more powerful than the other: pbcegis $\neq$
cegis. In fact, pbcegis $\not\subseteq$ cegis and cegis $\not\subseteq$ pbcegis.


We prove this using the following two lemmas. The first, Lemma 5.4, shows that there is a family
of languages from which a language can be identified by cegis but not by pbcegis.
The second, Lemma 5.5, shows that there is another family of languages from which a language can be
identified by pbcegis but not by cegis.


Lemma 5.4 There is a family of languages $\mathcal{L}$ such that pbcegis cannot identify every language $L$ in
$\mathcal{L}$ but cegis can do so, that is, cegis $\not\subseteq$ pbcegis.

Proof Consider the language family formed by upper bounding the elements by some fixed
constant. Let the target language $L$ (for which we want to identify the program $P$) be $L_i$. In the rest of the
proof, we also refer to this family as $\mathcal{L}$ for brevity.

Language Family 2 $\mathcal{L}_{notpb} = \{L_i \mid i \in \mathbb{N}\}$ such that $L_i = \{n \mid n \in \mathbb{N} \wedge n < i\}$.

If we obtain a transcript $\tau[j]$ at any point in synthesis using positive bounded counterexamples,
then for any intermediate language $L_j$ proposed by $T_{pbcegis}$, $\mathrm{PBCHECK}_L$ would always return $\bot$ since all
the counterexamples would be larger than any element in $\tau[j]$. This is a consequence of the chosen
languages, in which all counterexamples to the language are larger than any positive example of the
language. So, $T_{pbcegis}$ cannot identify the target language $L$.

But we can easily design a synthesis engine $T_{cegis}$ using arbitrary counterexamples that can synthesize
$P$ corresponding to the target language $L$. The algorithm starts with $L_0$ as its initial guess. If there
is no counterexample, the algorithm's next guess is $L_1$. In each iteration $j$, the algorithm guesses $L_{j+1}$ as
long as there are no counterexamples. When a counterexample is returned by $\mathrm{CHECK}_L$ on the guess $L_{j+1}$,
the algorithm stops and reports the previous guess $L_j$ as the correct language.

Since the elements in each language $L_i$ are bounded by some fixed constant $i$, the above synthesis
procedure $T_{cegis}$ is guaranteed to terminate after $i$ iterations when identifying any language $L_i \in \mathcal{L}$.
Further, $\mathrm{CHECK}_L$ did not return any counterexample up to iteration $j - 1$ and so, $L_j \subseteq L_i$. And in the next
iteration, a counterexample was generated. So, $L_{j+1} \not\subseteq L_i$. Since the languages in $\mathcal{L}$ form a
monotonic chain $L_0 \subset L_1 \subset \cdots$, we have $L_j = L_i$. In fact, $j = i$ and in the $i$-th iteration, the
language $L_i$ is correctly identified by $T_{cegis}$. Thus, cegis $\not\subseteq$ pbcegis. ■
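The two halves of this argument can be illustrated with a short Python sketch. It assumes languages of the form $\{n \mid n < i\}$ represented by their index; the helper names `check`, `pb_check` and `identify` are hypothetical, and the positive bounded verifier is modeled in the simplest way consistent with its description (a counterexample must not exceed the largest positive example seen).

```python
from typing import List, Optional

def check(guess: int, target: int) -> Optional[int]:
    """Toy arbitrary-counterexample verifier: a counterexample is an element
    of the guessed language that is not in the target language."""
    return guess - 1 if guess > target else None

def pb_check(guess: int, target: int, seen: List[int]) -> Optional[int]:
    """Toy positive bounded verifier: counterexamples may not exceed the
    largest positive example seen so far."""
    bound = max(seen, default=-1)
    cexs = [n for n in range(guess) if n >= target and n <= bound]
    return min(cexs) if cexs else None

def identify(target: int) -> int:
    """Guess index 0, 1, 2, ... and stop at the first counterexample.
    The empty language at index 0 can never yield a counterexample."""
    j = 0
    while check(j + 1, target) is None:
        j += 1
    return j   # index of the previous guess
```

Because every positive example is below the target index and every counterexample is at or above it, `pb_check` has nothing to report for any guess, whereas `identify` stops exactly at the target index.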

This shows that cegis can be used to identify languages when pbcegis will fail. Putting a restriction
on the verifier to only produce counterexamples which are bounded by the positive examples seen so far
does not strictly increase the power of synthesis.

We now show that this restriction enables identification of languages which cannot be identified by
cegis.

In the proof below, we construct a language which is not distinguishable using arbitrary counterexamples
and instead relies on the verifier keeping a record of the largest positive example seen so far
and restricting counterexamples to those below the largest positive example.


Lemma 5.5 There is a family of languages $\mathcal{L}$ such that cegis cannot identify a language $L$ in
$\mathcal{L}$ but pbcegis can identify $L$, that is, pbcegis $\not\subseteq$ cegis.


Proof Consider the language

$L_{32} = \{3^j 2^i \mid j \in \{0, 1\}, i \in \mathbb{N}\}$

where $3^j 2^i$ is a natural number obtained by taking the product of 3 raised to the power of $j$ and 2 raised
to the power of $i$. $L_{32}$ is a set of these natural numbers. We now construct a family of languages which
are finite subsets of $L_{32}$ and have at least one member of the form $3 \cdot 2^i$, that is,

$\mathcal{L}_{32} = \{L_{32}^i \mid i \in \mathbb{N}, L_{32}^i \subset L_{32}, L_{32}^i \text{ is finite and } \exists k \text{ s.t. } 3 \cdot 2^k \in L_{32}^i\}$

We now consider the language

$L_2 = \{2^i \mid i \in \mathbb{N}\}$

Now, let $\mathcal{L}_2$ be the family of languages such that the smallest member of each language is
given by the index of the language, that is,

$\mathcal{L}_2 = \{L_2^i \mid i \in \mathbb{N}, L_2^i \subseteq L_2, L_2^i \text{ is infinite and } \min(L_2^i) = 2^i\}$

Now, we consider the following family of languages.

Language Family 3 $\mathcal{L}_{32} \cup \mathcal{L}_2$


We refer to this language family as $\mathcal{L}$ in the rest of the proof for brevity. We show that there is a
language $L$ in $\mathcal{L}$ such that $L$ cannot be identified by cegis, whereas pbcegis can identify any language in $\mathcal{L}$.

The key intuition is as follows. If the examples seen by the synthesis algorithm until some iteration $i$ are
all of the form $2^j$, then no synthesis technique can differentiate whether the language belongs to
$\mathcal{L}_{32}$ or $\mathcal{L}_2$. If the language belongs to $\mathcal{L}_{32}$, the synthesis engine would eventually obtain an example
of the form $3 \cdot 2^j$ (since each language in $\mathcal{L}_{32}$ has at least one element of this kind and these languages
are finite). While the synthesis technique using arbitrary counterexamples cannot recover the previous
examples, the techniques with access to the verifier which produces positive bounded counterexamples
can recover all the previous examples.

We now specify a $T_{pbcegis}$ which can identify languages in $\mathcal{L}$. The synthesis approach works in two
possible steps.

• Until an example $3 \cdot 2^j$ is seen by the synthesis engine, let $2^i$ be the smallest member element seen so
far in the transcript; the learner proposes $L_i$ as the language. If the target language $L \in \mathcal{L}_2$, the learner
would eventually identify the language since the minimal element will show up in the transcript. If the
target language $L \in \mathcal{L}_{32}$, then eventually an example of the form $3 \cdot 2^j$ will be seen since $L$ must have
one such member element. After such an example is seen in the transcript, the synthesis engine
moves to the second step.

• After an example of the form $3 \cdot 2^j$ is seen, the synthesis engine can now be sure that the language
belongs to $\mathcal{L}_{32}$ and is finite. Now, the learner can discover all the positive examples seen so far using
the following trick. We first discover the upper bound $B_p$ on positive examples seen so far.

$B_p$ = minimum $k$ such that $\mathrm{PBCHECK}_L(\{3^k\}, \tau[n])$ returns $\bot$, for $k = 2, 3, \ldots$

Recall that $3^k$, $k = 2, 3, \ldots$ are not in the target language since they are not in any of the languages in
the family $\mathcal{L}$ to which the target language belongs. $\mathrm{PBCHECK}_L$ will return the only element $3^k$ in the
proposed candidate language as a counterexample as long as there is some positive example $2^i$ seen previously
such that $2^i > 3^k$. So, $3^{B_p}$ is the upper bound on all the positive examples seen so far. The learner
can now construct singleton languages $\{2^j\}$ for $j = 0, 1, \ldots, l$ such that $2^l < 3^{B_p}$. If a counterexample
is returned by $\mathrm{PBCHECK}_L(\{2^j\}, \tau[n])$, then $2^j$ is not in the target language. If no counterexample is
returned, then $2^j$ is in the target language. This allows the synthesis engine to recover all the positive
examples seen previously in finitely many steps. As we recover the positive examples, we run a Gold style
algorithm for identifying finite languages [28] to converge to the correct language. Thus, the learner
would identify the correct language using finite memory.

We now prove that cegis does not identify this family of languages. Let us assume that $\mathcal{L} \in$ cegis.
So, there is a synthesis engine $T_{cegis}$ which can identify all languages in $\mathcal{L}$. So, $T_{cegis}$ must converge
to any language $L_1 \in \mathcal{L}_2$ after some finite transcript $\tau_s$. Let us consider an extension $\tau_s 2^m$ of $\tau_s$ such
that $2^m \in L_1$ and $2^m \notin \mathrm{SAMPLE}(\tau_s)$. Such an element $2^m$ exists since $\tau_s$ is a finite transcript and $L_1$ is an
infinite language. Since the learner converges to $L_1$ starting from the initial language $L_0$ after consuming
$\tau_s$, $\mathrm{learn}(L_0, \tau_s 2^m, cex') = \mathrm{learn}(L_0, \tau_s, cex)$.

Let us consider two transcripts $\tau_s 2^m (3 \cdot 2^p) \bot^\omega$ and $\tau_s (3 \cdot 2^p) \bot^\omega$, where $\bot^\omega$ denotes repeating $\bot$
infinitely in the rest of the transcript. We know that $\mathrm{learn}(L_0, \tau_s 2^m, cex') = \mathrm{learn}(L_0, \tau_s, cex) = L_1$
and thus, $\mathrm{learn}(\tau_s 2^m (3 \cdot 2^p) \bot^\omega, cex') = \mathrm{learn}(\tau_s (3 \cdot 2^p) \bot^\omega, cex) = \mathrm{learn}(L_1, (3 \cdot 2^p) \bot^\omega, cex'')$. So,
the synthesis engine would behave exactly the same for both transcripts, and if it converges to a language
$L_2$ on one transcript, it would converge to the same language on the other transcript. But the two
transcripts are clearly from two different languages in $\mathcal{L}_{32}$. One of the transcripts corresponds to the finite
language $\mathrm{SAMPLE}(\tau_s) \cup \{3 \cdot 2^p\}$ and the other corresponds to $\mathrm{SAMPLE}(\tau_s) \cup \{2^m, 3 \cdot 2^p\}$. This is a
contradiction and hence, there is no synthesis engine $T_{cegis}$ using arbitrary counterexamples that can identify
all languages in $\mathcal{L}$.


5.1.3 Different Flavors of Bounded Counterexamples 

Finally, we compare pbcegis and cbcegis and show that they are not contained in each other. 

Theorem 5.6 The power of synthesis techniques using bounded counterexamples is neither less nor
more than the techniques using positive bounded counterexamples, that is, cbcegis $\not\subseteq$ pbcegis and
pbcegis $\not\subseteq$ cbcegis.

Proof We consider two language families from previous proofs and show that the languages in one of
them can be identified by pbcegis but not by cbcegis, while the languages in the other can be identified
by cbcegis but not by pbcegis.

Consider Language Family 1 ($\mathcal{L}_{notcb}$), formed by lower bounding the elements by some fixed constant,
that is, $\mathcal{L}_{notcb} = \{L_i \mid i > B \text{ and } L_i = \{n \mid n \in \mathbb{N} \wedge n > i\}\}$ where $B$ is a fixed integer constant. We
have proved in Theorem 5.2 that a synthesis engine $T_{cbcegis}$ cannot identify all languages in $\mathcal{L}_{notcb}$. On
the other hand, any counterexample is smaller than all positive examples in any language in $\mathcal{L}_{notcb}$. So, a
verifier producing positive bounded counterexamples behaves like an arbitrary counterexample verifier,
since any positive example is larger than all negative examples. Thus, since $T_{cegis}$ can identify the languages
in this language class, so can $T_{pbcegis}$. So, pbcegis $\not\subseteq$ cbcegis.

Now, consider the following family of languages.

Language Family 4 $\mathcal{L}_{cbnotpb} = \{L_i \mid i < B\}$ where $L_i = \{n \mid n \in \mathbb{N} \wedge n < i\}$

This is a slight variant of the language class considered in proving $T_{cegis}$ to be more powerful than
$T_{pbcegis}$, where we have restricted the class of languages to be a finite set. As stated earlier, PBCHECK
does not produce any counterexample for these languages since all positive examples are smaller than
any counterexample. But CBCHECK can be used to identify languages in this class by selecting the bound
of the counterexamples to be $B$. Since the counterexamples are at most of size $B$ for these languages,
a bounded counterexample verifier behaves exactly like an arbitrary counterexample producing verifier.
Thus, cbcegis $\not\subseteq$ pbcegis. ■

5.2 Infinite Memory Inductive Synthesis


We now consider the case where the inductive learning engine has infinite unbounded memory. This case 
is simpler than the one considered earlier with finite memory bound on the inductive learning engine and 
most of the results presented here follow from the results proved for the finite memory case. For brevity 
of space, we only give proof sketches highlighting the difference from the finite memory case. 


1. The proof of Theorem 5.1 works even when we replace the inductive learning engine using finite
memory with the one using infinite memory. Further, the minimal counterexample can still be used
as an arbitrary counterexample. And so, MINCEGIS $=$ CEGIS.


2. Next, we show that CBCEGIS $\subseteq$ CEGIS. Consider an arbitrary but fixed constant $B$. For this $B$,
consider all verifiers CBCHECK that only produce counterexamples bounded by $B$. We wish to argue
that any infinite memory learner LEARN that can converge to a target language $L_c$ using any
CBCHECK can also do so using CHECK. The basic idea is as follows: since LEARN has infinite
memory, it can make extra queries to CHECK to obtain counterexamples bounded by $B$ and learn
only from those. Suppose at some step it received a counterexample $x$ bigger than $B$ for candidate
language $L$. Then LEARN constructs a new candidate language $L'$ that excludes $x$ but otherwise
agrees with $L$ (see the footnote below). It then queries CHECK with this new candidate $L'$, and iterates
the process until a counterexample less than $B$ is received (which must happen if such a counterexample exists).
LEARN uses its infinite-size memory to construct candidate languages that keep track of a potentially
unbounded number of counterexamples bigger than $B$. Thus, LEARN uses this procedure to
convert any CHECK into some CBCHECK. Since CBCEGIS comprises all language families learnable
by LEARN given any CBCHECK, these language families are also learnable by LEARN using CHECK.
Therefore, CBCEGIS $\subseteq$ CEGIS.


3. We now sketch the proof for PBCEGIS $\subseteq$ CEGIS. The argument is similar to the previous case.
Since the learner has infinite memory, it can store all the positive examples seen so far. Moreover,
similar to the case of CBCEGIS, it can construct a stream of candidate languages to query CHECK
so as to obtain positive history bounded counterexamples, as follows. It queries CHECK to obtain
an arbitrary counterexample. If this is smaller than the largest of the stored positive
examples, then the learner uses this example for proposing the next hypothesis language. If this
counterexample is larger than the largest positive example, it constructs a new candidate language
by excluding this counterexample from the previous candidate language, and again queries CHECK
to obtain a new counterexample. This continues until the learner obtains a positive history bounded
counterexample or there is no such counterexample. Thus, the learner now uses only positive
history bounded counterexamples, and hence, $T_{CEGIS}$ can identify any language that $T_{PBCEGIS}$ can
identify.
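The exclude-and-requery procedure used in the last two proof sketches can be illustrated as follows. This is a sketch under simplifying assumptions, not the construction from the paper: candidates are finite sets of naturals (so the loop terminates), `make_check` is a hypothetical arbitrary-counterexample verifier, and the bound is treated as inclusive.

```python
from typing import Callable, FrozenSet, Optional, Set

Check = Callable[[FrozenSet[int]], Optional[int]]

def make_check(target: FrozenSet[int]) -> Check:
    """Hypothetical arbitrary-counterexample verifier: returns the largest
    element of the candidate that lies outside the target, or None (for ⊥)."""
    def check(candidate: FrozenSet[int]) -> Optional[int]:
        bad = candidate - target
        return max(bad) if bad else None
    return check

def bounded_cex(check: Check, candidate: FrozenSet[int], bound: int) -> Optional[int]:
    """Query `check` repeatedly, excluding each too-large counterexample from
    the candidate, until one within `bound` appears or none is left."""
    excluded: Set[int] = set()            # may grow without bound in general
    while True:
        x = check(candidate - excluded)
        if x is None or x <= bound:
            return x
        excluded.add(x)
```

Passing a fixed constant as `bound` mimics a constant bounded verifier, while passing the largest stored positive example mimics a positive history bounded one. The growing `excluded` set is where the unbounded memory is used.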


We now present three languages used previously in proofs for inductive learning engines using finite 
memory, and show how these languages allow us to distinguish relative power of synthesis engines. 

1. Consider the language family $\mathcal{L}_{notcb} = \{L_i \mid i > B \text{ and } L_i = \{n \mid n \in \mathbb{N} \wedge n > i\}\}$ where $B$ is
a constant bound. The argument in Theorem 5.2 also holds for the infinite memory synthesis
engines, and so, $\mathcal{L}_{notcb} \in \overline{\mathrm{CBCEGIS}} \cap \mathrm{CEGIS}$.

Further, a positive history bounded verifier will always return a counterexample if one exists since
all counterexamples are smaller than any positive example in the language. Thus, $T_{PBCEGIS}$ can also
identify languages in $\mathcal{L}_{notcb}$. Thus, $\mathcal{L}_{notcb} \in \overline{\mathrm{CBCEGIS}} \cap \mathrm{PBCEGIS}$.


Footnote: We can do this as we have a finite representation of $L$ (e.g., in the form of its characteristic
function) and can modify this to initially check if the input is $x$, and if so, to report that this is not in
the modified language.

2. Consider the language family $\mathcal{L}_{notpb} = \{L_i \mid i \in \mathbb{N}\}$ where

$L_i = \{n \mid n \in \mathbb{N} \wedge n < i\}$


As argued in the proof of Theorem 5.3, the verifier producing positive bounded counterexamples
will not report any counterexample for any of the languages in $\mathcal{L}_{notpb}$ because all counterexamples
are larger than any positive example. So, languages in this family cannot be identified by $T_{PBCEGIS}$
but these can be identified using $T_{CEGIS}$. So, $\mathcal{L}_{notpb} \in \overline{\mathrm{PBCEGIS}} \cap \mathrm{CEGIS}$.

3. Consider the finite language family $\mathcal{L}_{cbnotpb} = \{L_i \mid i < B\}$ where

$L_i = \{n \mid n \in \mathbb{N} \wedge n < i\}$


As argued in the proof of Theorem 5.6, the verifier PBCHECK does not produce any counterexample for
these languages since all positive examples are smaller than any counterexample. But CBCHECK
can be used to identify languages in this class by selecting the bound to be $B$. Since the
counterexamples are at most of size $B$ for these languages, a bounded counterexample verifier behaves
exactly like an arbitrary counterexample producing verifier. Thus, $\mathcal{L}_{cbnotpb} \in \overline{\mathrm{PBCEGIS}} \cap \mathrm{CBCEGIS}$.

We now summarize the results described in this section. For finite memory learners, cbcegis $\subset$
mincegis $=$ cegis, and pbcegis and cegis are not comparable, that is, pbcegis $\not\subseteq$ cegis and pbcegis
$\not\supseteq$ cegis. cbcegis and pbcegis are also not comparable. In the case of infinite memory learners,
CBCEGIS $\subset$ MINCEGIS $=$ CEGIS, and PBCEGIS $\subset$ CEGIS $=$ MINCEGIS. CBCEGIS and PBCEGIS are again not
comparable. The results are summarized in Figure 2.



Figure 2: Summary of Results on Decidability of Synthesis for Infinite Language Classes


6 Analysis of OGIS for Finite Language Classes 

We now discuss the case when the class of candidate programs (languages) has finite cardinality. As
before, rather than referring to programs we will refer to synthesizing the languages identified by those
programs. If the language class is finite then there exists a terminating OGIS procedure, e.g., one that
simply enumerates languages from this class until one satisfying the specification $\Phi$ is obtained. Moreover,
any implementation of OGIS which uses an oracle that provides new (positive/negative) examples
in every iteration ruling out at least one candidate language will terminate with the correct language. The
counterexample guided inductive synthesis approach [55] for bitvector sketches and oracle guided inductive
synthesis using distinguishing inputs [30] for programs composed of a finite library of components
are examples of OGIS synthesis techniques applied to finite language classes. We analyze the complexity
of synthesis for finite language classes and discuss its relation to the notion of teaching dimension from
the concept learning literature [20]. This connection between synthesis of languages from finite classes
and teaching of concepts has been discussed in prior work. Here we establish that the size of the smallest set of
examples for language (program) synthesis is bounded below by the teaching dimension of the concept
class corresponding to the class of languages.

6.1 NP-hardness 

We measure the efficiency of an OGIS synthesis engine using the notion of sample complexity mentioned
earlier: the number of queries (and responses) needed to correctly identify a language. In order to
analyze sample complexity, we need to fix the nature of queries to the oracle. We focus on queries to
which the oracle provides an example or counterexample in response. We show that finding the minimal
set of examples to be provided by the oracle such that the synthesis engine converges to the correct
language is NP-hard.

Theorem 6.1 Solving the formal inductive synthesis problem $\langle \mathcal{C}, E, \Phi, \mathcal{O} \rangle$ for a finite $\mathcal{C}$ and finite $E$
with the minimum number of queries is NP-hard for any oracle interface $\mathcal{O}$ comprising the correctness
query $q_{corr}$ (and possibly $q_{poswit}$ and $q_{negwit}$).

Proof We prove NP-hardness through reduction from the minimum set cover problem. Consider the
minimum set cover problem with $k$ sets $S_1, S_2, \ldots, S_k$ and a universe comprising $m$ elements $x_1, x_2, \ldots, x_m$
which needs to be covered using the sets. We reduce it to a formal inductive synthesis problem
$\langle \mathcal{C}, E, \Phi, \mathcal{O} \rangle$ where $\mathcal{C} = \{L_1, L_2, \ldots, L_{m+1}\}$ is a set of $m+1$ languages, $E = \{e_1, e_2, \ldots, e_k\}$ is the
domain comprising $k$ examples over which the languages are defined, and $\Phi = \{L_{m+1}\}$ is the specification.
Intuitively, the $m$ languages $L_1, \ldots, L_m$ are associated to the $m$ elements in the set cover problem. The $k$
examples correspond to the $k$ sets. The sets $L_1, L_2, \ldots, L_{m+1}$ are constructed as follows: For all
$1 \leq i \leq k$ and $1 \leq j \leq m$, example $e_i$ belongs to the symmetric difference of $L_j$ and $L_{m+1}$ if and only if
the set $S_i$ contains element $x_j$. We can do this, for instance, by including $e_i$ in $L_j$ but not in $L_{m+1}$.

Consider fhe operation of an OGIS procedure implementing an G confaining <7corr- Every unsuccess¬ 
ful correcfness query refurns a counferexample which is an elemenf of E in fhe symmefric difference 
of fhe proposed Lj and L^+i- Lef be fhe smallesf sef of counferexamples fhaf uniquely 

idenfifies fhe correcf language L„,+i. So, for all 1 < y < ni, fhere exisfs some // such fhaf eifher e,, G Lj 
or e/, G L,n+i buf nol bolh. And so, for all \ < j <m, fhere exisfs some // such fhaf Xj G 8^ where 
ii G {i\,i 2 ,---,in}- Moreover, dropping // resulls in some Xj nof being covered (fhe corresponding Lj 
is nol distinguished from L„,+i). Thus, Si^,Si^,... ,Si^ is a solution fo fhe minimum sef cover problem 
which is known fo be NP-complele. Similarly, if is easy fo see fhaf any solution fo fhe minimum sef cover 
problem also yields a minimum counferexample sef. 

We can therefore conclude that solving the formal inductive synthesis problem
$\langle \mathcal{C}, E, \Phi, \mathcal{O} \rangle$ with the minimum number of queries is NP-hard.
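
As an illustration of the reduction above, the following Python sketch builds the languages from a
small set cover instance and checks by brute force that a smallest separating set of examples coincides
with a minimum set cover. The function names, the choice of an empty $L_{m+1}$, and the sample instance
are assumptions made for this sketch; they are not part of the original construction.

```python
from itertools import combinations


def build_fis_instance(sets, universe):
    """Build L_1..L_m and L_{m+1} over examples e_1..e_k from a set cover instance.

    Index i stands for example e_i (one per set S_i). L_j contains e_i exactly
    when x_j is in S_i, and L_{m+1} is empty, so e_i lies in the symmetric
    difference of L_j and L_{m+1} if and only if S_i contains x_j.
    """
    languages = [frozenset(i for i, s in enumerate(sets) if x in s) for x in universe]
    target = frozenset()  # L_{m+1}, the language named by the specification
    return languages, target


def min_counterexample_set(languages, target, num_examples):
    """Smallest set of examples separating every L_j from the target (brute force)."""
    for size in range(num_examples + 1):
        for chosen in combinations(range(num_examples), size):
            if all(any((e in lang) != (e in target) for e in chosen) for lang in languages):
                return set(chosen)
    return None  # some L_j coincides with the target on all examples


def min_set_cover(sets, universe):
    """Smallest family of set indices covering the universe (brute force)."""
    for size in range(len(sets) + 1):
        for chosen in combinations(range(len(sets)), size):
            if all(any(x in sets[i] for i in chosen) for x in universe):
                return set(chosen)
    return None  # the sets do not cover the universe


sets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
universe = [1, 2, 3, 4, 5]
languages, target = build_fis_instance(sets, universe)
cex = min_counterexample_set(languages, target, len(sets))
cover = min_set_cover(sets, universe)
print(cex, cover)
```

Both searches are exponential in the number of examples, which is consistent with the hardness result;
the sketch is only meant for very small instances.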


We note that this proof applies to any FIS problem with an oracle interface $\mathcal{O}$ containing the
correctness query $q_{corr}$. Moreover, this proof can be easily extended to other oracle interfaces as well,
such as the version of the distinguishing input method that does not use the correctness query, with
$\mathcal{O} = \{q^{+}_{wit}, q_{diff}, q_{mem}\}$. In this latter case, the combined use of $q_{diff}$ and
$q_{mem}$ yields the desired mapping.

6.2 Relation to Teaching Dimension 

Goldman et al. [20, 21] proposed teaching dimension as a measure to study the computational complexity of
learning. They consider a teaching model in which a helpful teacher selects the examples of the concept
and provides them to the learner. Informally, the teaching dimension of a concept class is the minimum
number of examples that a teacher must reveal to uniquely identify any target concept chosen from the
class.

For a domain $E$ and concept class $\mathcal{C}$, a concept $c \in \mathcal{C}$ is a set of examples from $E$.
So, $\mathcal{C} \subseteq 2^{E}$. In the learning model proposed by Goldman et al. [20, 21], the basic goal
of the teacher is to help the learner identify the target concept $c^* \in \mathcal{C}$ by providing an
example sequence from $E$. We now formally define the teaching dimension of a concept class.

Definition 6.1 (adapted from [20]) An example sequence is a sequence of labeled examples from $E$,
where the labels are given by some underlying specification. For concept class $\mathcal{C}$ and target
concept $c \in \mathcal{C}$, we say $T$ is a teaching sequence for $c$ (in $\mathcal{C}$) if $T$ is an example
sequence that uniquely identifies $c$ in $\mathcal{C}$; that is, $c$ is the only concept in $\mathcal{C}$
consistent with $T$. Let $T(c)$ denote the set of all teaching sequences for $c$. Teaching dimension
$TD(\mathcal{C})$ of the concept class is defined as follows:

$$TD(\mathcal{C}) = \max_{c \in \mathcal{C}} \left( \min_{\tau \in T(c)} |\tau| \right)$$
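
For a finite domain and a finite concept class, the definition above can be evaluated directly. The
Python sketch below does so by brute force. It treats a teaching sequence as a set of distinct examples
labeled by the target, which gives the same minimum length. The function names and the threshold class
used as input are assumptions chosen for illustration.

```python
from itertools import combinations


def teaching_set_size(target, concepts, domain):
    """Size of a smallest example set whose labels under `target` exclude all other concepts."""
    others = [c for c in concepts if c != target]
    for size in range(len(domain) + 1):
        for examples in combinations(domain, size):
            # Every other concept must disagree with the target on some chosen example.
            if all(any((x in c) != (x in target) for x in examples) for c in others):
                return size
    raise ValueError("some concept agrees with the target on the whole domain")


def teaching_dimension(concepts, domain):
    """Maximum over target concepts of the minimum teaching set size."""
    return max(teaching_set_size(c, concepts, domain) for c in concepts)


domain = [0, 1, 2, 3]
# Threshold concepts {x : x >= t} for t = 0..4 (the last one is the empty concept).
thresholds = [frozenset(x for x in domain if x >= t) for t in range(len(domain) + 1)]
td = teaching_dimension(thresholds, domain)
print(td)
```

For this threshold class, an interior threshold needs one negative and one positive example on either
side of the boundary, while the two extreme concepts need a single example each.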

Consider an FIS problem where the specification is complete, i.e., $\Phi = \{L_c\}$. Consider an instance
of OGIS using any combination of witness, equivalence, subsumption, or distinguishing input queries.
Each of these queries, if it does not terminate the OGIS loop, returns a new example for the learner. Thus,
the number of iterations of the OGIS loop, its sample complexity, is the number of examples needed by
the learner to identify a correct language. Suppose the minimum such number of examples, for any
specification (target language $L_c \in \mathcal{C}$), is $M_{OGIS}(\mathcal{C})$. Then, the following theorem
must hold.

Theorem 6.2 $M_{OGIS}(\mathcal{C}) \ge TD(\mathcal{C})$

The theorem can be obtained by a straightforward proof by contradiction: if
$M_{OGIS}(\mathcal{C}) < TD(\mathcal{C})$, then for each target concept to be learned, there is a shorter
teaching sequence than $TD(\mathcal{C})$, viz., the one used by the OGIS instance for that target,
contradicting the definition of teaching dimension.
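
To make the notion of sample complexity concrete, here is a minimal Python sketch of a CEGIS-style loop
over a finite concept class that counts the counterexamples it consumes. The learner strategy (propose
the first candidate consistent with all examples seen so far) and the oracle's choice of the first
differing example are assumptions for illustration; other strategies can yield different counts.

```python
def cegis(concepts, target, domain):
    """Run a simple CEGIS loop; return (learned concept, number of counterexamples used).

    Assumes `target` is one of `concepts`, so a consistent candidate always exists.
    """
    examples = {}  # example -> label (True if the example belongs to the target)
    while True:
        # Learner: first candidate that agrees with every labeled example so far.
        candidate = next(
            c for c in concepts
            if all((x in c) == label for x, label in examples.items())
        )
        # Oracle: correctness query; counterexamples lie in the symmetric difference.
        diff = [x for x in domain if (x in candidate) != (x in target)]
        if not diff:
            return candidate, len(examples)
        examples[diff[0]] = diff[0] in target


domain = [0, 1, 2, 3]
# Threshold concepts {x : x >= t} for t = 0..4.
thresholds = [frozenset(x for x in domain if x >= t) for t in range(len(domain) + 1)]
target = frozenset({3})
learned, num_examples = cegis(thresholds, target, domain)
print(learned, num_examples)
```

Each counterexample is new to the learner, because the proposed candidate already agrees with the target
on all earlier examples, so the loop terminates after at most as many iterations as there are examples.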

Now, given that the teaching dimension is a lower bound on the sample complexity of OGIS, it is
natural to ask how large $TD(\mathcal{C})$ can grow in practice. This is still a largely open question for
general language classes. However, results from machine learning theory can help shed more light on this
question. One of these results relates the teaching dimension to a second metric for measuring complexity
of learning, namely the Vapnik-Chervonenkis (VC) dimension [60]. We define this below.

Definition 6.2 [20] Let $E$ be the domain of examples and $c$ be a concept from the class $\mathcal{C}$. A
finite set $E' \subseteq E$ is shattered by $\mathcal{C}$ if $\{c \cap E' \mid c \in \mathcal{C}\} = 2^{E'}$. In
other words, $E' \subseteq E$ is shattered by $\mathcal{C}$ if for each subset $E'' \subseteq E'$, there is a
concept $c \in \mathcal{C}$ which contains all of $E''$, but none of the instances in $E' - E''$.
The Vapnik-Chervonenkis (VC) dimension is defined to be the smallest $d$ for which no set of $d+1$ examples
is shattered by $\mathcal{C}$.
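
For a finite domain and a finite, non-empty concept class, the definition above can be checked by
enumeration. The Python sketch below is one such brute-force check; the function names and the threshold
class used as input are assumptions for illustration. It relies on the fact that every subset of a
shattered set is itself shattered, so the search can stop at the first size with no shattered set.

```python
from itertools import combinations


def is_shattered(subset, concepts):
    """True if the concepts realize all 2^|subset| subsets of `subset` as intersections."""
    traces = {c & subset for c in concepts}
    return len(traces) == 2 ** len(subset)


def vc_dimension(concepts, domain):
    """Smallest d such that no set of d + 1 examples is shattered by the concepts."""
    d = 0
    for size in range(1, len(domain) + 1):
        if any(is_shattered(frozenset(s), concepts) for s in combinations(domain, size)):
            d = size
        else:
            break  # no set of this size is shattered, so no larger set is either
    return d


domain = [0, 1, 2, 3]
# Threshold concepts {x : x >= t} for t = 0..4.
thresholds = [frozenset(x for x in domain if x >= t) for t in range(len(domain) + 1)]
vc = vc_dimension(thresholds, domain)
print(vc)
```

The threshold class shatters single examples but no pair: for examples $a < b$, no threshold concept
contains $a$ without also containing $b$.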

Blumer et al. have shown that the VC dimension of a concept class characterizes the number
of examples required for learning any concept in the class under the distribution-free or probably
approximately correct (PAC) model of Valiant. The differences between teaching dimension and
Vapnik-Chervonenkis dimension are discussed at length by Goldman and Kearns [20]. The following
theorem from [20] provides lower and upper bounds on the teaching dimension of a finite concept class
in terms of the size of the concept class and its VC dimension.

Theorem 6.3 [20] The teaching dimension $TD(\mathcal{C})$ of any concept class $\mathcal{C}$ satisfies the
following upper and lower bounds:

$$VC(\mathcal{C}) / \log(|\mathcal{C}|) \le TD(\mathcal{C}) \le |\mathcal{C}| - 1$$

where $VC(\mathcal{C})$ is the VC dimension of the concept class and $|\mathcal{C}|$ denotes the number of
concepts in the concept class.

Moreover, Goldman and Kearns [20] exhibit a concept class for which the upper bound is tight. This
indicates that without restrictions on the concept class, one may not be able to prove very strong bounds
on the sample complexity of OGIS.
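
As a worked illustration of tightness, consider the class consisting of the empty concept and all
singletons over $n \ge 1$ examples. This is a standard class with the stated property; it is given here
as an assumption for illustration and is not necessarily the class used in [20]. The logarithm is taken
to base 2, which is also an assumption.

```latex
\mathcal{C}_n = \{\emptyset, \{x_1\}, \ldots, \{x_n\}\}, \qquad |\mathcal{C}_n| = n + 1
% A singleton {x_i} is taught by the single positive example x_i.
% The empty concept needs all n negative examples: only x_i rules out {x_i}.
TD(\mathcal{C}_n) = \max(1, n) = n = |\mathcal{C}_n| - 1
% Every single example is shattered, but no concept contains two examples, so
VC(\mathcal{C}_n) = 1, \qquad
\frac{VC(\mathcal{C}_n)}{\log_2 |\mathcal{C}_n|} = \frac{1}{\log_2 (n + 1)} \le n = TD(\mathcal{C}_n)
```

Here the upper bound holds with equality, while the lower bound becomes weaker as $n$ grows.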

To summarize, we have shown that solving the formal inductive synthesis problem for finite domains 
and finite concept classes with the minimum number of queries is NP-hard. Further, we showed that 
the combinatorial measure of teaching dimension captures the smallest number of examples required to 
identify the correct language. 

7 Conclusion 

We presented a theoretical framework and analysis of formal inductive synthesis by formalizing the
notion of oracle-guided inductive synthesis (OGIS). We illustrated how OGIS generalizes instances of
concept learning in machine learning as well as synthesis techniques developed using formal methods.
We focused on counterexample-guided inductive synthesis (CEGIS), an OGIS implementation
that uses the verification engine as the oracle. We presented different variations of CEGIS motivated by
practice, and showed that their synthesis power can differ, especially when the learning engine
can only store a bounded number of examples. There are several directions for future work. We discuss
some open problems below that would further improve the theoretical understanding of formal inductive
synthesis.

• The teaching dimension of concept classes such as decision trees and axis-parallel rectangles has been
well studied in the literature. But the teaching dimension of formal concept classes such as programs in
the while language with only linear arithmetic over integers is not known. Finding teaching
dimensions for these classes would help in establishing bounds on the number of examples needed for
synthesizing programs from these classes.

• We investigated the difference in synthesis power when the learning engine has finite memory versus
when it has infinite memory. Another important question is how the power of the synthesis engine
changes when we restrict the time complexity of the learning engine, for example to learning engines
that take time polynomial in the number of examples.

• We have not analyzed the impact of different learning strategies that may traverse the space of possible 
programs (languages) in various ways. This is also an interesting avenue for future work. 

In summary, our paper is a first step towards a theory of formal inductive synthesis, and much remains 
to be done to improve our understanding of this emerging area with several practical applications. 